Preface


Matrix theory is a fundamental area of mathematics with appli- 
cations not only to many branches of mathematics but also to sci- 
ence and engineering. Its connections to many different branches 
of mathematics include: (i) algebraic structures such as groups, 
fields, and vector spaces; (ii) combinatorics, including graphs and 
other discrete structures; and (iii) analysis, including systems of 
linear differential equations and functions of a matrix argument. 

Generally, elementary (and some advanced) books on matri- 
ces ignore or only touch on the combinatorial or graph-theoretical 
connections with matrices. This is unfortunate in that these con- 
nections can be used to shed light on the subject, and to clarify and 
deepen one’s understanding. In fact, a matrix and a (weighted) 
graph can each be regarded as different models of the same math- 
ematical concept. 

Most researchers in matrix theory, and most users of its meth- 
ods, are aware of the importance of graphs in linear algebra. This 
can be seen from the great number of papers in which graph- 
theoretic methods for solving problems in linear algebra are used. 
Also, electrical engineers apply these methods in practical work. 
But, in most instances, the graph is considered as an auxiliary, 
but nonetheless very useful, tool for solving important problems. 

This book differs from most other books on matrices in that 
the combinatorial, primarily graph-theoretic, tools are put in the 
forefront of the development of the theory. Graphs are used to 
explain and illuminate basic matrix constructions, formulas, com- 
putations, ideas, and results. Such an approach fosters a better 
understanding of many ideas of matrix theory and, in some instances,
contributes to easier descriptions of them. The approach taken in this
book should be of interest to mathematicians, electrical engineers, and
other specialists in sciences such as chemistry and physics.

Each of us has written a previous book that is related to the 
present book: 


I. R. A. Brualdi, H. J. Ryser, Combinatorial Matrix Theory, 
Cambridge: Cambridge University Press, 1991; reprinted 
1992. 


II. D. Cvetković, Combinatorial Matrix Theory, with Applications
to Electrical Engineering, Chemistry and Physics (in
Serbian), Beograd: Naučna knjiga, 1980; 2nd ed. 1987.


This joint book came about as a result of a proposal from the 
second-named author (D.C.) to the first-named author (R.A.B.) 
to join in reworking and translating (parts of) his book (II). While 
that book—mainly the theoretical parts of it—has been used as 
a guide in preparing this book, the material has been rewritten 
in a major way with some new organization and with substantial 
new material added throughout. The stress in this book is on 
the combinatorial aspects of the topics treated; other aspects of 
the theory (e.g., algebraic and analytic) are described as much as 
necessary for the book to be reasonably self-contained and to provide
some coherence. Some material that is rarely found in books
at this level, for example, Geršgorin’s theorem and its extensions,
the Kronecker product of matrices, and sign-nonsingular matrices and
evaluation of the permanent, is included in the book.

Thus our goal in writing this book is to increase one’s under- 
standing of and intuition for the fundamentals of matrix theory, 
and its application to science, with the aid of combinatorial/graph- 
theoretic tools. The book is not written as a first course in linear 
algebra. It could be used in a special course in matrix theory for 
students who know the basics of vector spaces. More likely, this 
book could be used as a supplementary book for courses in matrix 
theory (or linear algebra). It could also be used as a book for an 
undergraduate seminar or as a book for self-study. 

We now briefly describe the chapters of the book. In the first 
chapter we review the basics and terminology of graph theory, 
elementary counting formulas, fields, and vector spaces. It is expected
that someone reading this book has a previous acquaintance
with vector spaces. In Chapter 2 the algebra of matrices
is explained, and the König digraph is introduced and then used 
in understanding and carrying out basic matrix operations. The 
short Chapter 3 is concerned with matrix powers and their de- 
scription in terms of another digraph associated with a matrix. 

In Chapter 4 we introduce the Coates digraph of a matrix and 
use it to give a graph-theoretic definition of the determinant. The 
fundamental properties of determinants are established using the 
Coates digraph. These include the Binet-Cauchy formula and the 
Laplace development of the determinant along a row or column. 
The classical formula for the determinant is also derived. Chapter 
5 is concerned with matrix inverses and a graph-theoretic interpre- 
tation is given. In Chapter 6 we develop the elementary theory of 
solutions of systems of linear equations, including Cramer’s rule, 
and show how the Coates digraph can be used to solve a linear 
system. Some brief mention is made of sparse matrices. 

In Chapter 7 we study the eigenvalues, eigenvectors, and char- 
acteristic polynomial of a matrix. We give a combinatorial argu- 
ment for the classical Cayley-Hamilton theorem and a very com- 
binatorial proof of the Jordan canonical form of a matrix. Chapter 
8 is about nonnegative matrices and their special properties that 
highly depend on their digraphs. We discuss, but do not prove, 
the important properties of nonnegative matrices that are part of 
the Perron-Frobenius theory. We also describe some basic proper- 
ties of graph spectra. There are three unrelated topics in Chapter 
9, namely, Kronecker products of matrices, eigenvalue inclusion 
regions, and the permanent of a matrix and its connection with 
sign-nonsingular matrices. In Chapter 10 we describe some appli- 
cations in electrical engineering, physics, and chemistry. 

Our hope is that this book will be useful for students,
teachers, and users of matrix theory.


Richard A. Brualdi 


Dragoš Cvetković 


Dedication 


To Les and Carol Brualdi for keeping the family together 
Richard A. Brualdi 


To my grandchildren Nebojša and Katarina Cvetković 
Dragoš Cvetković


Chapter 1 


Introduction 


In this introductory chapter, we discuss ideas and results from 
combinatorics (especially graph theory) and algebra (fields and 
vector spaces) that will be used later. Analytical tools, as well as
the elements of polynomial theory, which are sometimes used in
this book, are not specifically mentioned or defined, since we
believe that the reader will be familiar with them. In accordance
with the goals of this book, vector spaces are described in a very 
limited way. The emphasis of this book is on matrix theory and 
computation, and not on linear algebra in general. 


The first two sections are devoted to the basic concepts of graph 
theory. In Section 1.1 (undirected) graphs are introduced while 
Section 1.2 is concerned with digraphs (directed graphs). Section 
1.3 gives a short overview of permutations and combinations of 
finite sets, including their enumeration. The last two sections 
contain algebraic topics. Section 1.4 summarizes basic facts on 
fields while Section 1.5 reviews the basic structure of vector spaces 
of n-tuples over a field. 


Matrices, the main objects of study in this book, will be in- 
troduced in the next chapter. They act on vector spaces but, 
together with many algebraic properties, contain much combina- 
torial, in particular, graph-theoretical, structure. In this book we 
exploit these combinatorial properties of matrices to present and 
explain many of their basic features. 
1.1 Graphs


The basic notions of graph theory are very intuitive, and as a result 
we shall dispense with some formality in our explanations. Most 
of what follows consists of definitions and elementary properties. 


Definition 1.1.1 A graph G consists of a finite set V of elements 
called vertices and a set E of unordered pairs of vertices called 
edges. The order of the graph G is the number |V| of its vertices. 
If a = {x,y} is an edge, then a joins vertices x and y, and x and 
y are vertices of the edge a. If x = y, then a is a loop. A subgraph 
of G is a graph H with vertex set W ⊆ V whose edges are some,
possibly all, of the edges of G joining vertices in W. The subgraph
H is an induced subgraph of G provided each edge of G that joins
vertices in W is also an edge of H. The subgraph H is a spanning 
subgraph of G provided W = V, that is, provided H contains all 
the vertices of G (but not necessarily all the edges). A multigraph 
differs from a graph in that there may be several edges joining the 
same two vertices. Thus the edges of a multigraph form a multiset 
of pairs of vertices. A weighted graph is a graph in which each edge 
has an assigned weight (generally, a real or complex number). If all 
the weights of a graph G are positive integers, then the weighted
graph could be regarded as a multigraph G′ with the weight of an
edge {x,y} in G regarded as the number of edges in G′ joining the
vertices x and y.


Figure 1.1 (panels: G, $H_1$, $H_2$)

Graphs can be pictured geometrically by representing each vertex
by a (geometric) point in the plane, and each edge by a (geometric)
edge, that is, a straight line or curve joining corresponding
geometric points. Care needs to be taken so that a geometric edge,
except for its two endpoints, contains no other point representing
a vertex of the graph. A graph G and two subgraphs $H_1$ and $H_2$
are drawn in Figure 1.1. The graph $H_1$ is a spanning subgraph
of G; the graph $H_2$ is not a spanning subgraph but is an induced
subgraph.


Definition 1.1.2 Let G be a graph. A walk in G, joining vertices
u and v, is a sequence γ of vertices $u = x_0, x_1, \ldots, x_{k-1}, x_k = v$
such that $\{x_i, x_{i+1}\}$ is an edge for each $i = 0, 1, \ldots, k-1$. The
edges of the walk γ are these k edges, and the length of γ is k. If
u = v, then γ is a closed walk. If the vertices $x_0, x_1, \ldots, x_{k-1}, x_k$
are distinct, then γ is a path joining u and v. If u = v and the
vertices $x_0, x_1, \ldots, x_{k-1}, x_k$ are otherwise distinct, then γ is a cycle.


The graph G is connected provided that for each pair of distinct
vertices u and v there is a walk joining u and v. A graph that is
not connected is called disconnected.


It is to be noted that if there is a walk γ joining vertices u and v,
then there is a path joining u and v. Such a path can be obtained
from γ by eliminating cycles as they are formed in traversing γ.
A path has one fewer edge than it has vertices. The number of
vertices of a cycle equals the number of its edges. We sometimes
regard a path (respectively, cycle) as a graph whose vertices are
the vertices on the path (respectively, cycle) and whose edges are
the edges of the path (respectively, cycle). A path with n vertices
is denoted by $P_n$, and a cycle with n vertices is denoted by $C_n$.
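The remark that a walk can be reduced to a path by eliminating cycles as they are formed can be sketched in code. The following is a minimal illustration, not part of the original text; the function name `walk_to_path` and the representation of a walk as a Python list of vertices are assumptions made here.

```python
def walk_to_path(walk):
    """Reduce a walk (a list of vertices) to a path with the same endpoints.

    Whenever a vertex reappears, the closed stretch traversed since its
    first appearance is dropped, which is the cycle elimination described
    in the text.
    """
    path = []
    position = {}  # vertex -> its index in the current path
    for v in walk:
        if v in position:
            cut = position[v]
            # Forget the vertices that were visited after the first v.
            for dropped in path[cut + 1:]:
                del position[dropped]
            path = path[:cut + 1]
        else:
            position[v] = len(path)
            path.append(v)
    return path


walk = [1, 2, 3, 2, 4, 5, 4, 6]
print(walk_to_path(walk))  # [1, 2, 4, 6]
```

Each consecutive pair in the result is also a consecutive pair somewhere in the original walk, so the result uses only edges of the walk. For a closed walk the sketch returns the single starting vertex.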


Definition 1.1.3 Let G be a graph with vertex set V. Define
u ≡ v provided there is a walk joining u and v in G. Then it
is easy to verify that this is an equivalence relation and thus V
is partitioned into equivalence classes $V_1, V_2, \ldots, V_l$ whereby two
vertices are joined by a walk in G if and only if they are in the
same equivalence class. The subgraphs of G induced on the sets
of vertices $V_1, V_2, \ldots, V_l$ are the connected components of G. The
graph G is connected if and only if it has exactly one connected
component.
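The equivalence classes of Definition 1.1.3 can be computed by a breadth-first search from each vertex not yet reached. This is only a sketch under assumptions made here: the name `connected_components` is hypothetical, and the graph is given as a list of vertices together with a list of vertex pairs.

```python
from collections import deque


def connected_components(vertices, edges):
    """Return the vertex sets of the connected components, as a list of sets."""
    adjacency = {v: set() for v in vertices}
    for x, y in edges:
        adjacency[x].add(y)
        adjacency[y].add(x)

    seen = set()
    components = []
    for start in vertices:
        if start in seen:
            continue
        # Collect every vertex joined to `start` by a walk.
        component = {start}
        seen.add(start)
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for w in adjacency[u]:
                if w not in seen:
                    seen.add(w)
                    component.add(w)
                    queue.append(w)
        components.append(component)
    return components


print(connected_components([1, 2, 3, 4, 5, 6], [(1, 2), (2, 3), (4, 5)]))
# [{1, 2, 3}, {4, 5}, {6}]
```

The graph is connected exactly when the returned list has length one.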
A tree is a connected graph with no cycles. A spanning tree
of G is a spanning subgraph of G that is a tree. Only connected 
graphs have a spanning tree, and a spanning tree can be obtained 
by recursively removing an edge of a cycle until no cycles remain. 
The graph in Figure 1.2 is a tree with 5 vertices and 4 edges. A 
forest is a graph each of whose connected components is a tree. 


Figure 1.2 


The next theorem contains some basic properties of trees. 


Theorem 1.1.4 Let G be a graph of order n ≥ 2 without any
loops. The following are equivalent: 


(i) G is a tree. 


(ii) For each pair of distinct vertices u and v there is a unique 
path joining u and v. 


(iii) G is connected and has exactly n — 1 edges. 


(iv) G is connected and removing an edge of G always results in 
a disconnected graph. 


An edge of a connected graph whose removal results in a dis- 
connected graph is called a bridge. A bridge cannot be an edge of 
any cycle. Property (iv) above thus asserts that a graph is a tree 
if and only if it is connected and every edge is a bridge. 
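Characterizations (iii) and (iv) of Theorem 1.1.4 are easy to test on small examples. The sketch below is illustrative only: the function names are hypothetical, the vertices are assumed to be labelled 0, 1, ..., n − 1, and the graph is assumed to have no loops or repeated edges.

```python
def is_connected(n, edges):
    """Check connectivity of a graph on vertices 0..n-1 by a depth-first search."""
    adjacency = [[] for _ in range(n)]
    for x, y in edges:
        adjacency[x].append(y)
        adjacency[y].append(x)
    seen = {0}
    stack = [0]
    while stack:
        u = stack.pop()
        for w in adjacency[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == n


def is_tree(n, edges):
    # Characterization (iii): connected with exactly n - 1 edges.
    return is_connected(n, edges) and len(edges) == n - 1


def every_edge_is_bridge(n, edges):
    # Characterization (iv): connected, and deleting any one edge disconnects.
    return is_connected(n, edges) and all(
        not is_connected(n, edges[:i] + edges[i + 1:])
        for i in range(len(edges))
    )


path = [(0, 1), (1, 2), (2, 3)]
cycle = path + [(3, 0)]
print(is_tree(4, path), every_edge_is_bridge(4, path))    # True True
print(is_tree(4, cycle), every_edge_is_bridge(4, cycle))  # False False
```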

In Figure 1.3 we show all the structurally different trees of 
order k with k ≤ 5.
Definition 1.1.5 In a graph G (or multigraph) the degree of a
vertex u is the number of edges containing u where, in the case of a
loop, there is a contribution of 2 to the degree. Let G be of order n,
and let the degrees of its vertices be $d_1, d_2, \ldots, d_n$, where, without
loss of generality, we may assume that $d_1 \ge d_2 \ge \cdots \ge d_n \ge 0$.
Then $d_1, d_2, \ldots, d_n$ is the degree sequence of G. Since each edge
contributes 1 to the degree of two vertices, or, in the case of loops,
2 to the degree of one vertex, we have


$$d_1 + d_2 + \cdots + d_n = 2e,$$


where e is the number of edges. A graph is regular provided each 
vertex has the same degree. If k is the common degree, then the 
graph is regular of degree k. A connected regular graph of degree 
2 is a circuit. A pendent vertex of a graph is a vertex of degree 
1. The unique edge containing a particular pendent vertex is a 
pendent edge. 
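The degree sequence and the identity relating the degree sum to the number of edges can be checked directly. In this sketch the name `degree_sequence` and the labelling of vertices by 0, 1, ..., n − 1 are assumptions made for illustration; a loop is given as a pair (x, x).

```python
def degree_sequence(n, edges):
    """Return the degrees of a (multi)graph on vertices 0..n-1, largest first."""
    degree = [0] * n
    for x, y in edges:
        degree[x] += 1
        degree[y] += 1  # for a loop x == y, so the loop contributes 2
    return sorted(degree, reverse=True)


# A triangle on 0, 1, 2, an edge {2, 3}, and a loop at 3.
edges = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 3)]
d = degree_sequence(4, edges)
print(d)                       # [3, 3, 2, 2]
print(sum(d), 2 * len(edges))  # 10 10
```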




Figure 1.3 


The complete graph $K_n$ of order n is the graph in which each
pair of distinct vertices forms an edge. Thus $K_n$ is a regular graph
of degree n − 1 and has exactly n(n − 1)/2 edges. Since a tree of order
n has n − 1 edges, the sum of the degrees of its vertices equals
2(n − 1). Thus a tree of order n ≥ 2 has at least two pendent
vertices, and indeed has exactly 2 pendent vertices if and only if
it is a path. Removing a pendent vertex together with its pendent
edge from a tree leaves a tree of order 1 less.
Definition 1.1.6 A vertex-coloring of a graph is an assignment of
a color to each vertex so that vertices that are joined by an edge
are colored differently. One way to color a graph is to assign a
different color to each vertex. The chromatic number of a graph G
is the smallest number χ(G) of colors needed to color its vertices.


The chromatic number of the complete graph $K_n$ equals n.
The chromatic number of a circuit is 2 if it has even length and
is 3 if it has odd length. The chromatic number of a tree of order
n ≥ 2 equals 2. This latter fact follows easily by induction on the
order of a tree, by removing a pendent vertex together with its pendent edge.


Definition 1.1.7 A graph G is bipartite provided its chromatic
number satisfies χ(G) ≤ 2. Only when G has no edges can the
chromatic number of a bipartite graph be 1. Assume that G is 
a bipartite graph with vertex set V and at least one edge. Then 
V can be partitioned into two sets U and W such that each edge 
joins a vertex in U to a vertex in W. The pair U,W is called a 
bipartition of V (or of G). 
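A bipartition, when one exists, can be found by coloring the vertices with two colors along a breadth-first search. The following sketch is not from the text; the name `bipartition` and the input format (a list of vertices and a list of pairs) are assumptions. When the graph is disconnected the bipartition is not unique, and the sketch returns just one of the possibilities.

```python
from collections import deque


def bipartition(vertices, edges):
    """Return a pair (U, W) of vertex sets if the graph is bipartite, else None."""
    adjacency = {v: [] for v in vertices}
    for x, y in edges:
        adjacency[x].append(y)
        adjacency[y].append(x)

    color = {}
    for start in vertices:
        if start in color:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for w in adjacency[u]:
                if w not in color:
                    color[w] = 1 - color[u]
                    queue.append(w)
                elif color[w] == color[u]:
                    # An edge (or loop) inside one color class: not bipartite.
                    return None
    U = {v for v in vertices if color[v] == 0}
    W = {v for v in vertices if color[v] == 1}
    return U, W


print(bipartition([0, 1, 2, 3], [(0, 1), (1, 2), (2, 3), (3, 0)]))  # ({0, 2}, {1, 3})
print(bipartition([0, 1, 2], [(0, 1), (1, 2), (2, 0)]))             # None
```

The two examples are a circuit of even length and a circuit of odd length, matching the chromatic numbers 2 and 3 stated earlier.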


If G is a connected bipartite graph, its bipartition is unique. A 
tree is a bipartite graph. Bipartite graphs are usually drawn with 
one set of the bipartition on the left and the other on the right (or 
one on top and the other on the bottom); so edges go from left to 
right (or from top to bottom). An example of such a drawing of a 
bipartite graph is given in Figure 1.4. 


Figure 1.4 
Let m and n be positive integers. The complete bipartite graph
$K_{m,n}$ is the bipartite graph with vertex set V = U ∪ W, where
U contains m vertices and W contains n vertices and each pair
{u, w} where u ∈ U and w ∈ W is an edge of $K_{m,n}$. Thus $K_{m,n}$
has exactly mn edges.


Definition 1.1.8 Let G be a graph of order n. A matching M 
in G is a collection of edges no two of which have a vertex in 
common. If v is a vertex and there is an edge of M containing v, 
then v meets the matching M and the matching M meets the vertex 
v. A perfect matching of G, also called a 1-factor, is a matching 
that meets all vertices of G. The largest number of edges in a 
matching in G is the matching number m(G). If G has at least
one edge, then 1 ≤ m(G) ≤ ⌊n/2⌋. A matching with k edges is
called a k-matching. 

A subset U of the vertices of G is a vertex-cover provided each 
edge of G has at least one of its vertices in U. The smallest number 
of vertices in a vertex-cover is the cover number c(G) of G. If G
has at least one edge that is not a loop, then 1 ≤ c(G) ≤ n − 1.


The complete bipartite graph $K_{m,n}$ has matching and covering
number equal to min{m, n}. The complete graph $K_n$ has matching
number equal to ⌊n/2⌋ and covering number equal to n − 1.
The following theorem of König asserts that for bipartite graphs, 
the matching and covering numbers are equal. 


Theorem 1.1.9 Let G be a bipartite graph. Then m(G) = c(G), 
that is, the largest number of edges in a matching equals the small- 
est number of vertices in a vertex-cover. 
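For very small graphs, both numbers in König's theorem can be found by exhaustive search, which makes the statement easy to experiment with. This is a brute-force sketch for illustration, not an efficient algorithm; the function names are hypothetical and the graphs are assumed to have no loops.

```python
from itertools import combinations


def matching_number(edges):
    """Largest number of pairwise vertex-disjoint edges (exhaustive search)."""
    for k in range(len(edges), 0, -1):
        for subset in combinations(edges, k):
            endpoints = [v for e in subset for v in e]
            if len(endpoints) == len(set(endpoints)):
                return k
    return 0


def cover_number(vertices, edges):
    """Smallest number of vertices meeting every edge (exhaustive search)."""
    for k in range(len(vertices) + 1):
        for subset in combinations(vertices, k):
            chosen = set(subset)
            if all(x in chosen or y in chosen for x, y in edges):
                return k


# A bipartite graph with U = {a, b, c} and W = {1, 2, 3}.
bip_vertices = ["a", "b", "c", 1, 2, 3]
bip_edges = [("a", 1), ("a", 2), ("b", 1), ("c", 1)]
print(matching_number(bip_edges), cover_number(bip_vertices, bip_edges))  # 2 2

# The complete graph K_3 is not bipartite, and the two numbers differ.
k3_edges = [(0, 1), (1, 2), (0, 2)]
print(matching_number(k3_edges), cover_number([0, 1, 2], k3_edges))  # 1 2
```

The second example agrees with the values ⌊n/2⌋ and n − 1 given above for the complete graph of order n = 3.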


The notion of isomorphism of graphs is meant to make precise 
the statement that two graphs are structurally the same. 


Definition 1.1.10 Let G be a graph with vertex set V and let
H be a graph with vertex set W. An isomorphism from G to H
is a bijection $\phi : V \to W$ such that {x,y} is an edge of G if and
only if $\{\phi(x), \phi(y)\}$ is an edge of H. If $\phi$ is an isomorphism from
G to H, then clearly $\phi^{-1} : W \to V$ is an isomorphism from H
to G. The graphs G and H are isomorphic provided there is an
isomorphism from G to H (and thus one from H to G). The notion
of isomorphism carries over to multigraphs by requiring that the
edge {x,y} occur as many times in G as the edge $\{\phi(x), \phi(y)\}$
occurs in H.
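Definition 1.1.10 can be applied literally on small graphs by trying every bijection between the vertex sets. The sketch below does this; it is an illustration only, with a hypothetical function name, and its running time grows like n!, so it is suitable only for graphs with a handful of vertices.

```python
from itertools import permutations


def find_isomorphism(V, E, W, F):
    """Return an isomorphism from (V, E) to (W, F) as a dict, or None.

    V and W are lists of vertices; E and F are lists of vertex pairs.
    """
    E = {frozenset(e) for e in E}
    F = {frozenset(f) for f in F}
    if len(V) != len(W) or len(E) != len(F):
        return None
    for image in permutations(W):
        phi = dict(zip(V, image))
        # phi is a bijection, so mapping E exactly onto F is equivalent to
        # "{x, y} is an edge of G iff {phi(x), phi(y)} is an edge of H".
        if {frozenset(phi[x] for x in e) for e in E} == F:
            return phi
    return None


path_a = ([1, 2, 3, 4], [(1, 2), (2, 3), (3, 4)])
path_b = (["a", "b", "c", "d"], [("a", "c"), ("c", "b"), ("b", "d")])
star = (["p", "q", "r", "s"], [("p", "q"), ("p", "r"), ("p", "s")])

print(find_isomorphism(*path_a, *path_b) is not None)  # True
print(find_isomorphism(*path_a, *star))                # None
```

The path and the star both have four vertices and three edges, so equal counts alone do not decide isomorphism.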


1.2 Digraphs 


In a graph, edges are unordered pairs of vertices and thus have no 
direction. In a directed graph, edges are ordered pairs of vertices 
and thus have a direction (or orientation) from the first vertex to 
the second vertex in the ordered pair. Most of the ideas introduced 
for graphs can be carried over to directed graphs, modified only 
to take into account the directions of the edges. As a result, we 
shall be somewhat brief. 


Definition 1.2.1 A directed graph (for short, a digraph) G con- 
sists of a finite set V of elements called vertices and a set E of 
ordered pairs of vertices called (directed) edges. The order of the 
digraph G is the number |V| of its vertices. If a = (x,y) is an 
edge, then x is the initial vertex of a and y is the terminal vertex,
and we say that a is an edge from x to y. In case x = y,
a is a loop with initial and terminal vertices both equal to x. A 
multidigraph differs from a digraph in that there may be several 
edges with the same initial vertex and the same terminal vertex. 
A weighted digraph is a digraph in which each edge has an assigned 
weight. 


The notions of subgraph, spanning subgraph, and induced sub- 
graph of a graph carry over in the obvious way to subdigraph, span- 
ning subdigraph, and induced subdigraph of a digraph. Digraphs 
are pictured as graphs, except now the edges have arrows on them 
to indicate their direction. A digraph G with a spanning subdigraph
$H_1$ and an induced subdigraph $H_2$ are pictured in Figure
1.5.


Figure 1.5 


In a digraph G, a vertex has two degrees. The outdegree $d^+(v)$
of a vertex v is the number of edges of which v is an initial vertex;
the indegree $d^-(v)$ of v is the number of edges of which v is
a terminal vertex. A loop at a vertex contributes 1 to both its 
indegree and its outdegree. Clearly, the sum of the indegrees of 
the vertices of a digraph equals the sum of the outdegrees. 


Definition 1.2.2 Let G be a digraph. A walk in G from vertex u
to vertex v is a sequence γ of vertices $u = x_0, x_1, \ldots, x_{k-1}, x_k = v$
such that $(x_i, x_{i+1})$ is an edge for each $i = 0, 1, \ldots, k-1$. The
edges of the walk γ are these k edges and γ has length k. In a
closed walk, u = v. In a path, the vertices $x_0, x_1, \ldots, x_{k-1}, x_k$ are
distinct. If u = v and the vertices $x_0, x_1, \ldots, x_{k-1}, x_k$ are otherwise
distinct, then the subdigraph consisting of the vertices and edges
of γ is a cycle. The digraph G is acyclic provided it has no cycles.
If there is a walk from vertex u to vertex v, then there is a path
from u to v. The digraph G is strongly connected provided that
for each pair u and v of distinct vertices, there is a path from u to
v and a path from v to u.

Define u ≡ v provided there is a walk from u to v and a walk
from v to u. This is an equivalence relation and thus V is partitioned
into equivalence classes $V_1, V_2, \ldots, V_l$. The l subdigraphs
induced on the sets of vertices $V_1, V_2, \ldots, V_l$ are the strong
components of G. The digraph G is strongly connected if and only if it
has exactly one strong component.
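The strong components can be computed straight from this definition: two vertices are equivalent when each is reachable from the other. The sketch below takes that direct route rather than using a faster specialised algorithm; the function names and the input format are assumptions made for illustration.

```python
def reachable(adjacency, start):
    """Set of vertices v for which there is a walk from start to v."""
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for w in adjacency[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen


def strong_components(vertices, edges):
    """Return the vertex sets of the strong components, as a list of sets."""
    adjacency = {v: [] for v in vertices}
    for x, y in edges:
        adjacency[x].append(y)  # directed edge from x to y
    reach = {v: reachable(adjacency, v) for v in vertices}

    components = []
    assigned = set()
    for u in vertices:
        if u in assigned:
            continue
        # v is equivalent to u when each can be reached from the other.
        component = {v for v in reach[u] if u in reach[v]}
        components.append(component)
        assigned |= component
    return components


edges = [(1, 2), (2, 3), (3, 1), (3, 4), (4, 5), (5, 4)]
print(strong_components([1, 2, 3, 4, 5], edges))  # [{1, 2, 3}, {4, 5}]
```

In this example every edge between the two components goes from the first to the second, so the digraph is not strongly connected.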


The following theorem summarizes some important properties 
concerning these notions: 
Theorem 1.2.3 Let G be a digraph with vertex set V.


(i) Then G is strongly connected if and only if there does not 
exist a partition of V into two nonempty sets U and W such 
that all the edges between U and W have their initial vertex 
in U and their terminal vertex in W. 


(ii) The strong components of G can be ordered as $G_1, G_2, \ldots, G_l$
so that if (x,y) is an edge of G with x in $G_i$ and y in $G_j$ with
i ≠ j, then i < j (in the ordering $G_1, G_2, \ldots, G_l$ all edges
between the strong components go from left to right).


Let G be a digraph (or multidigraph) with vertex set V. By 
replacing each directed edge (x,y) of G by an undirected edge 
{x,y} and deleting any duplicate edges, we obtain a graph G′
called the underlying graph of G. The digraph G is called weakly
connected provided its underlying graph G′ is connected. The
digraph G is called unilaterally connected provided that for each 
pair of distinct vertices u and v, there is a path from u to v or 
a path from v to u. A unilaterally connected digraph is clearly 
weakly connected. 

The notion of isomorphism of digraphs (and multidigraphs) is 
quite analogous to that of graphs. The only difference is that the 
direction of edges has to be taken into account. 


Definition 1.2.4 Let G be a digraph with vertex set V, and let 
H be a digraph with vertex set W. An isomorphism from G to 
H is a bijection $\phi : V \to W$ such that (x,y) is an edge of G if
and only if $(\phi(x), \phi(y))$ is an edge of H. If $\phi$ is an isomorphism
from G to H, then $\phi^{-1} : W \to V$ is an isomorphism from H to
G. The digraphs G and H are isomorphic provided there is an 
isomorphism from G to H (and thus one from H to G). 


1.3 Some Classical Combinatorics 


In this section we review the notions of permutations and combi- 
nations and corresponding basic counting formulas. 
Definition 1.3.1 Let X be a set with n elements that, for ease
of description, we can assume to be the set {1,2,...,n} consisting
of the first n positive integers. A permutation of X is a listing
$i_1 i_2 \ldots i_n$ of the elements of X in some order. There are
$n! = n(n-1)(n-2)\cdots 1$ permutations of X. The permutation $i_1 i_2 \ldots i_n$
can be regarded as a bijection $\sigma : X \to X$ from X to X by defining
$\sigma(k) = i_k$ for k = 1,2,...,n.

Now let r be an integer with $1 \le r \le n$. An r-permutation
of X is a listing $i_1 i_2 \ldots i_r$ of r of the elements of X in
some order. There are $n(n-1)\cdots(n-r+1)$ r-permutations of
X, and this number can be written as $n!/(n-r)!$. (Here we adopt
the convention that 0! = 1 to allow for the case that r = n in the
formula.)
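The count of r-permutations can be compared with an explicit listing for small n and r. This is a quick sketch; the helper name `count_r_permutations` is introduced here only for illustration.

```python
from itertools import permutations
from math import factorial


def count_r_permutations(n, r):
    """Number of r-permutations of an n-element set, n!/(n-r)!."""
    return factorial(n) // factorial(n - r)


X = range(1, 6)  # the set {1, 2, 3, 4, 5}
listed = list(permutations(X, 3))
print(len(listed), count_r_permutations(5, 3))  # 60 60
print(count_r_permutations(5, 5))               # 120, using 0! = 1
```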

An r-combination of X is a selection of r of the objects of X 
without regard for order. Thus an r-combination of X is just a 
subset of X with r elements. Each r-combination can be ordered 
in r! ways, and in this way we obtain all the r-permutations of X. 
Thus the number of r-combinations of X equals 


$$\frac{n!}{r!\,(n-r)!},$$


a number we denote by $\binom{n}{r}$ (read as n choose r).


For instance,


$$\binom{n}{r} = \binom{n}{n-r} \quad (0 \le r \le n),$$


since the complement of an r-combination is an (n − r)-combination.
The number of combinations (of any size) of the set {1,2,...,n}
equals $2^n$, since each integer in the set can be chosen or left out of
a combination. Counting combinations by size k = 0,1,2,...,n,
we thus get the identity


$$\sum_{k=0}^{n} \binom{n}{k} = 2^n.$$
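The formula for the number of r-combinations, the symmetry in r and n − r, and the sum over all sizes can be verified by enumeration for a small set. In this sketch the helper `choose` is a name chosen here, implementing the factorial formula from the text.

```python
from itertools import combinations
from math import factorial


def choose(n, r):
    """Number of r-combinations of an n-element set, n!/(r!(n-r)!)."""
    return factorial(n) // (factorial(r) * factorial(n - r))


n = 6
X = range(1, n + 1)
sizes = [len(list(combinations(X, k))) for k in range(n + 1)]
print(sizes)                                  # [1, 6, 15, 20, 15, 6, 1]
print([choose(n, k) for k in range(n + 1)])   # the same list, from the formula
print(sum(sizes), 2 ** n)                     # 64 64
```

The printed list reads the same forwards and backwards, which is the symmetry between r-combinations and their complements.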

The above formulas hold for permutations and combinations 
in which one is not allowed to repeat an object. If we are allowed 
to repeat objects in a permutation, then more general formulas 
hold. The number of permutations of X = {1,2,...,n} in which,
for each k = 1,2,...,n, the integer k appears $m_k$ times equals


$$\frac{(m_1 + m_2 + \cdots + m_n)!}{m_1!\, m_2! \cdots m_n!}.$$


This follows by observing that such a permutation is a list of length
$N = m_1 + m_2 + \cdots + m_n$, and to form such a list we choose $m_1$
places for the 1’s, $m_2$ of the remaining places for the 2’s, $m_3$ of the
remaining places for the 3’s, and so forth, giving


$$\binom{N}{m_1}\binom{N-m_1}{m_2}\binom{N-m_1-m_2}{m_3}\cdots\binom{N-m_1-m_2-\cdots-m_{n-1}}{m_n}.$$


After substitution and cancellation, this reduces to the given
formula. The number of r-permutations $i_1 i_2 \ldots i_r$ of X =
{1,2,...,n} in which each integer in X can occur any number of
times (sometimes called an r-permutation-with-repetition) is
$n^r$, since there are n choices for each of the r integers $i_k$.
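The two expressions for the number of permutations with prescribed repetition counts, the quotient of factorials and the product of binomial coefficients, can be compared with a brute-force count. The function names below are hypothetical; `math.comb` requires Python 3.8 or later.

```python
from itertools import permutations, product
from math import comb, factorial


def multinomial(counts):
    """(m_1 + ... + m_n)! / (m_1! ... m_n!)."""
    total = factorial(sum(counts))
    for m in counts:
        total //= factorial(m)
    return total


def product_of_binomials(counts):
    """Choose places for the 1's, then for the 2's among the rest, and so on."""
    remaining = sum(counts)
    result = 1
    for m in counts:
        result *= comb(remaining, m)
        remaining -= m
    return result


counts = [2, 1, 2]         # two 1's, one 2, two 3's
items = [1, 1, 2, 3, 3]
distinct = set(permutations(items))
print(len(distinct), multinomial(counts), product_of_binomials(counts))  # 30 30 30

# r-permutations with unrestricted repetition: n**r of them.
print(len(list(product(range(1, 5), repeat=3))), 4 ** 3)  # 64 64
```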

For r-combinations of X = {1,2,...,n} in which the number 
of times an integer occurs is not restricted (other than by the size 
r of the combination), we have to choose how many times (denote
it by $x_k$) each integer k occurs in the r-combination. Thus the
number of such r-combinations equals the number of solutions in
nonnegative integers of the equation


$$x_1 + x_2 + \cdots + x_n = r.$$


This is the same as the number of permutations of the two integers
0 and 1 in which 1 occurs r times and 0 occurs n − 1 times (the
number of 1’s to the left of the first 0, in between the 0’s, and to
the right of the last 0 give the values of $x_1, x_2, \ldots, x_n$). Thus the
number of such r-combinations equals


$$\frac{(n+r-1)!}{r!\,(n-1)!} = \binom{n+r-1}{r} = \binom{n+r-1}{n-1}.$$
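The count of r-combinations with unrestricted repetition can be checked in two ways for small values: by listing the combinations and by listing the nonnegative integer solutions of the equation. The variable names are illustrative, and `math.comb` requires Python 3.8 or later.

```python
from itertools import combinations_with_replacement, product
from math import comb

n, r = 4, 3

# r-combinations of {1, ..., n} with repetition allowed.
multisets = list(combinations_with_replacement(range(1, n + 1), r))

# Solutions of x_1 + ... + x_n = r in nonnegative integers.
solutions = [x for x in product(range(r + 1), repeat=n) if sum(x) == r]

print(len(multisets), len(solutions), comb(n + r - 1, r), comb(n + r - 1, n - 1))
# 20 20 20 20
```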

Another useful counting technique is provided by the inclusion-exclusion
formula. Let $X_1, X_2, \ldots, X_n$ be subsets of a finite set
U. Then the number of elements of U in none of the sets
$X_1, X_2, \ldots, X_n$ is given by


$$|\overline{X}_1 \cap \overline{X}_2 \cap \cdots \cap \overline{X}_n| = \sum_{k=0}^{n} (-1)^k \sum_{K \subseteq \{1,2,\ldots,n\},\ |K| = k} \Bigl|\bigcap_{i \in K} X_i\Bigr|.$$


Here $\overline{X}_i$ is the complement of $X_i$ in U, that is, the subset of
elements of U that are not in $X_i$. For the value k = 0 in the formula,
we have $K = \emptyset$, and $\bigcap_{i \in \emptyset} X_i$ is an intersection over an empty set
and is interpreted as U.
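The inclusion-exclusion formula translates almost term by term into code. The sketch below is an illustration with a hypothetical function name; the example (integers from 1 to 30 divisible by none of 2, 3, 5) is chosen here and is not from the text.

```python
from itertools import combinations


def count_in_none(U, subsets):
    """Number of elements of U lying in none of the given subsets of U."""
    n = len(subsets)
    total = 0
    for k in range(n + 1):
        for K in combinations(range(n), k):
            # The intersection over the empty index set is interpreted as U.
            common = set(U)
            for i in K:
                common &= subsets[i]
            total += (-1) ** k * len(common)
    return total


U = set(range(1, 31))
subsets = [{x for x in U if x % p == 0} for p in (2, 3, 5)]
print(count_in_none(U, subsets))                       # 8
print(len([x for x in U if all(x % p for p in (2, 3, 5))]))  # 8, by direct count
```

Here the formula evaluates to 30 − (15 + 10 + 6) + (5 + 3 + 2) − 1 = 8.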

The set of n! permutations of {1,2,...,n} can be naturally 
partitioned into two sets of the same cardinality using properties 
called evenness and oddness. These properties and the resulting 
partition are discussed in Chapter 4. 


1.4 Fields 


The number systems with which we work in this book are primarily 
the real number system R and the complex number system C. But 
much of what we develop does not use any special properties of
these familiar number systems,¹ and works for any number system
called a field. We give a working definition of a field since it is not
in our interest to systematically develop properties of fields.


¹ One notable exception is that polynomials of degree at least 1 with complex
coefficients (in particular, polynomials with real coefficients) always have
roots (real or complex). In fact a polynomial of degree n ≥ 1 with complex
coefficients can be completely factored in the form $c(x - r_1)(x - r_2)\cdots(x - r_n)$,
where $c, r_1, r_2, \ldots, r_n$ are complex numbers. This property of complex numbers
is expressed by saying that the complex numbers are algebraically closed.
Definition 1.4.1 Let F be a set on which two binary operations²
are defined, called addition and multiplication, respectively, and
denoted as usual by “+” and “·”. Then F is a field provided the
following properties hold:


(i) (associative law for addition) a + (b + c) = (a + b) + c.

(ii) (commutative law for addition) a + b = b + a.

(iii) (zero element) There is an element 0 in F such that a + 0 =
0 + a = a.

(iv) (additive inverses) Corresponding to each element a, there is
an element a′ in F such that a + a′ = a′ + a = 0. The element
a′ is usually denoted by −a. Thus a + (−a) = (−a) + a = 0.

(v) (associative law for multiplication) a · (b · c) = (a · b) · c.

(vi) (commutative law for multiplication) a · b = b · a.

(vii) (identity element) There exists an element 1 in F different
from 0 such that 1 · a = a · 1 = a.

(viii) (multiplicative inverses) Corresponding to each element a ≠
0, there is an element a″ in F such that a · a″ = a″ · a = 1.
The element a″ is usually denoted by $a^{-1}$. Thus $a \cdot a^{-1} = a^{-1} \cdot a = 1$.

(ix) (distributive laws) a · (b + c) = a · b + a · c and (b + c) · a = b · a + c · a.


It is understood that the above properties are to hold for all choices
of the elements a, b, and c in F. Note that properties (i)-(iv) involve
only addition and properties (v)-(viii) involve only multiplication.
The distributive laws connect the two binary operations
and make them dependent on one another. We often drop the
multiplication symbol and write ab in place of a · b. Thus, for
instance, the associative law (v) becomes a(bc) = (ab)c.


² A binary operation on F means that given an ordered pair a, b of elements
in F, they can be combined using the operation to produce another element 
in F. This is sometimes expressed by saying that the operation of combining 
two elements satisfies the closure property. 
Examples of fields are (a) the set R of real numbers with the
usual addition and multiplication, (b) the set C of complex numbers
with the usual addition and multiplication, and (c) the set
Q of rational numbers with the usual addition and multiplication.
A familiar number system that is not a field is the set of integers
with the usual addition and multiplication (e.g., 2 does not have
a multiplicative inverse).

Properties (i), (iii), and (iv) are the defining properties for an 
algebraic system with one binary operation, denoted here by +, 
called a group. If property (ii) also holds then we have a com- 
mutative group. By properties (v)—(viii) the nonzero elements of 
a field form a commutative group under the binary operation of 
multiplication. 

In the next theorem we collect a number of elementary prop- 
erties of fields whose proofs are straightforward. 


Theorem 1.4.2 Let F be a field. Then the following hold: 
(i) The zero element 0 and identity element 1 are unique. 
(ii) The additive inverse of an element of F is unique. 


(iii) The multiplicative inverse of a nonzero element of F is 
unique. 


(iv) a · 0 = 0 · a = 0 for all a in F.

(v) −(−a) = a for all a in F.

(vi) $(a^{-1})^{-1} = a$ for all nonzero a in F.

(vii) (cancellation laws) If a · b = 0, then a = 0 or b = 0. If
a · b = a · c and a ≠ 0, then b = c.


We now show how one can construct fields with a finite number
of elements. Let $m$ be a positive integer. First we recall the
division algorithm, which asserts that if $a$ is any integer, there
are unique integers $q$ (the quotient) and $r$ (the remainder), with
$0 \le r \le m-1$, such that $a = qm + r$. For integers $a$ and $b$, define $a$
to be congruent modulo $m$ to $b$, denoted $a \equiv b \pmod{m}$, provided
$m$ is a divisor of $a - b$. Congruence modulo $m$ is an equivalence
relation, and as a result the set $\mathbb{Z}$ of integers is partitioned into
equivalence classes. The equivalence class containing $a$ is denoted
by $[a]_m$. Thus $[a]_m = [b]_m$ if and only if $m$ is a divisor of $a - b$.

It follows easily that $a \equiv b \pmod{m}$ if and only if $a$ and $b$
have the same remainder when divided by $m$. Thus there is a
one-to-one correspondence between equivalence classes modulo $m$
and the possible remainders $0, 1, 2, \ldots, m-1$ when an integer is
divided by $m$. We can thus identify the equivalence classes with
$0, 1, 2, \ldots, m-1$. Congruence satisfies a basic property with regard
to addition and multiplication that is easily verified:

If $a \equiv b \pmod{m}$ and $c \equiv d \pmod{m}$, then

\[ a + c \equiv b + d \pmod{m} \quad\text{and}\quad ac \equiv bd \pmod{m}. \]


This property allows one to add and multiply equivalence classes 
unambiguously as follows: 


\[ [a]_m + [b]_m = [a + b]_m \quad\text{and}\quad [a]_m \cdot [b]_m = [ab]_m. \]


Let $\mathbb{Z}_m = \{0, 1, 2, \ldots, m-1\}$. Then $\mathbb{Z}_m$ contains exactly one
element from each equivalence class, and we can regard addition and
multiplication of equivalence classes as addition and multiplication
of integers in $\mathbb{Z}_m$. For instance, let $m = 9$. Then, examples
of addition and multiplication in $\mathbb{Z}_9$ are

$4 + 3 = 7$ and $6 + 7 = 4$

$5 + 0 = 5$ and $1 \cdot 6 = 6$

$4 \cdot 8 = 5$ and $7 \cdot 4 = 1$
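The arithmetic of $\mathbb{Z}_m$ described above can be sketched in a few lines of Python. This is only an illustration: the function names `add_mod` and `mul_mod` are not from the text, and the elements of $\mathbb{Z}_m$ are assumed to be represented by the ordinary integers $0, 1, \ldots, m-1$.

```python
def add_mod(a: int, b: int, m: int) -> int:
    """Add two elements of Z_m; the result is the remainder in {0, ..., m-1}."""
    return (a + b) % m


def mul_mod(a: int, b: int, m: int) -> int:
    """Multiply two elements of Z_m; the result is the remainder in {0, ..., m-1}."""
    return (a * b) % m


if __name__ == "__main__":
    m = 9
    # Recompute the sample sums and products in Z_9 listed in the text.
    print(add_mod(4, 3, m), add_mod(6, 7, m))
    print(add_mod(5, 0, m), mul_mod(1, 6, m))
    print(mul_mod(4, 8, m), mul_mod(7, 4, m))
```

Reducing with `%` after each operation is exactly the identification of an equivalence class with its remainder.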


If $m$ is a prime number, then, as shown in the next theorem, $\mathbb{Z}_m$
is actually a field. To prove this, we require another basic property
of integers, namely, that if $a$ and $m$ are integers whose greatest
common divisor is $d$, then there are integers $s$ and $t$ expressing $d$
as a linear integer combination of $a$ and $m$:

\[ d = sa + tm. \]
Theorem 1.4.3 Let $m$ be a prime number. With the addition and
multiplication as defined above, $\mathbb{Z}_m$ is a field.


Proof. Most of the proof of this theorem is routine. It is
clear that $0 \in \mathbb{Z}_m$ and $1 \in \mathbb{Z}_m$ are the zero element and identity
element. If $a \in \mathbb{Z}_m$ and $a \neq 0$, then $m - a$ is the additive inverse
of $a$. If $a \in \mathbb{Z}_m$ and $a \neq 0$, then the greatest common divisor
of $a$ and $m$ is 1, and hence there exist integers $s$ and $t$ such that
$sa + tm = 1$. Thus $sa = 1 - tm$ is congruent to 1 modulo $m$. Let
$s^*$ be the integer in $\mathbb{Z}_m$ congruent to $s$ modulo $m$. Then we also
have $s^* a \equiv 1 \pmod{m}$. Hence $s^*$ is the multiplicative inverse of $a$
modulo $m$. Verification of the rest of the field properties is now
routine.


As an example, let $m = 7$. Then $\mathbb{Z}_7$ is a field with

$2 \cdot 4 = 1$ so that $2^{-1} = 4$ and $4^{-1} = 2$;
$3 \cdot 5 = 1$ so that $3^{-1} = 5$ and $5^{-1} = 3$;
$6 \cdot 6 = 1$ so that $6^{-1} = 6$.
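The proof of Theorem 1.4.3 is constructive: the integers $s$ and $t$ with $sa + tm = 1$ can be computed by the extended Euclidean algorithm, and $s$ reduced modulo $m$ is the inverse. The following is a minimal sketch under that reading; the helper names `extended_gcd` and `inverse_mod` are illustrative and not taken from the text.

```python
def extended_gcd(a: int, m: int) -> tuple[int, int, int]:
    """Return (d, s, t) with d = gcd(a, m) and d = s*a + t*m."""
    old_r, r = a, m
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    return old_r, old_s, old_t


def inverse_mod(a: int, m: int) -> int:
    """Return the multiplicative inverse of a in Z_m, if gcd(a, m) = 1."""
    d, s, _ = extended_gcd(a % m, m)
    if d != 1:
        raise ValueError(f"{a} has no multiplicative inverse modulo {m}")
    return s % m  # s* in the proof: the representative of s in Z_m


if __name__ == "__main__":
    for a in range(1, 7):
        print(a, inverse_mod(a, 7))
```

When $m$ is prime, every nonzero $a$ in $\mathbb{Z}_m$ passes the `d != 1` check, which is the content of the theorem.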


Two fields $F$ and $F'$ are isomorphic provided there is a bijection
$\phi : F \to F'$ that preserves both addition and multiplication:

\[ \phi(a + b) = \phi(a) + \phi(b), \quad\text{and}\quad \phi(a \cdot b) = \phi(a) \cdot \phi(b). \]


In these equations the leftmost binary operations (addition and 
multiplication, respectively) are those of F and the rightmost are 
those of F’. It is a fundamental fact that any two fields with the 
same finite number of elements are isomorphic. 


1.5 Vector Spaces 


There is an important, abstract notion of a vector space over a 
field that does not have to concern us here. We shall confine our 
attention to the vector space $F^n$ of $n$-tuples over a field $F$, whose
elements are called vectors, that is,

\[ F^n = \{ (a_1, a_2, \ldots, a_n) : a_i \in F,\ i = 1, 2, \ldots, n \}. \]
The zero vector is the $n$-tuple $(0, 0, \ldots, 0)$, where 0 is the zero
element of $F$. As usual, the zero vector is also denoted by 0 with
the context determining whether the zero element of $F$ or the zero
vector is intended. The elements of $F$ are now called scalars.

Using the addition and multiplication of the field $F$, vectors
can be added componentwise and multiplied by scalars. Let
$u = (a_1, a_2, \ldots, a_n)$ and $v = (b_1, b_2, \ldots, b_n)$ be in $F^n$. Then

\[ u + v = (a_1 + b_1, a_2 + b_2, \ldots, a_n + b_n). \]

If $c$ is in $F$, then

\[ cu = (ca_1, ca_2, \ldots, ca_n). \]

Since vector addition and scalar multiplication are defined in terms
of addition and multiplication in $F$ that satisfy certain associative,
commutative, and distributive laws, we obtain associative,
commutative, and distributive laws for vector addition and scalar
multiplication. These laws are quite transparent from those for $F$,
and we only mention the following:

(i) $u + 0 = 0 + u = u$ for all vectors $u$.

(ii) $0u = u0 = 0$ for all vectors $u$.

(iii) $u + v = v + u$ for all vectors $u$ and $v$.

(iv) $(c + d)u = cu + du$ for all vectors $u$ and scalars $c$ and $d$.

(v) $c(u + v) = cu + cv$ for all vectors $u$ and $v$ and scalars $c$.

(vi) $1u = u$ for all vectors $u$.

(vii) $(-1)u = (-u_1, -u_2, \ldots, -u_n)$ for all vectors $u$; this vector is
denoted by $-u$ and is called the negative of $u$ and satisfies
$u + (-u) = (-u) + u = 0$, for all vectors $u$.

(viii) $c(du) = (cd)u$ for all vectors $u$ and scalars $c$ and $d$.


A fundamental notion is that of a subspace of $F^n$. Let $V$ be a
nonempty subset of $F^n$. Then $V$ is a subspace of $F^n$ provided $V$
is closed under vector addition and scalar multiplication, that is,

(a) For all $u$ and $v$ in $V$, $u + v$ is also in $V$.
(b) For all $u$ in $V$ and $c$ in $F$, $cu$ is in $V$.


Let $u$ be in the subspace $V$. Because $0u = 0$, it follows that
the zero vector is in $V$. Similarly, $-u = (-1)u$ is in $V$ for all $u$ in $V$.
A simple example of a subspace of $F^n$ is the set of all vectors
$(0, a_2, \ldots, a_n)$ with first coordinate equal to 0. The set consisting
of the zero vector alone is also a subspace.


Definition 1.5.1 Let $u^{(1)}, u^{(2)}, \ldots, u^{(m)}$ be vectors in $F^n$, and let
$c_1, c_2, \ldots, c_m$ be scalars. Then the vector

\[ c_1 u^{(1)} + c_2 u^{(2)} + \cdots + c_m u^{(m)} \]

is called a linear combination of $u^{(1)}, u^{(2)}, \ldots, u^{(m)}$. If $V$ is a
subspace of $F^n$, then $V$ is closed under vector addition and scalar
multiplication, and it follows easily by induction that a linear
combination of vectors in $V$ is also a vector in $V$. Thus subspaces are
closed under linear combinations; in fact, this can be taken as
the defining property of subspaces. The vectors $u^{(1)}, u^{(2)}, \ldots, u^{(m)}$
span $V$ (equivalently, form a spanning set of $V$) provided every
vector in $V$ is a linear combination of $u^{(1)}, u^{(2)}, \ldots, u^{(m)}$. The zero
vector can be written as a linear combination of $u^{(1)}, u^{(2)}, \ldots, u^{(m)}$
with all scalars equal to 0; this is a trivial linear combination. The
vectors $u^{(1)}, u^{(2)}, \ldots, u^{(m)}$ are linearly dependent provided there are
scalars $c_1, c_2, \ldots, c_m$, not all of which are zero, such that

\[ c_1 u^{(1)} + c_2 u^{(2)} + \cdots + c_m u^{(m)} = 0, \]

that is, the zero vector can be written as a nontrivial linear
combination of $u^{(1)}, u^{(2)}, \ldots, u^{(m)}$. For example, the vectors
$(1, 4)$, $(3, -2)$, and $(3, 5)$ in $\mathbb{R}^2$ are linearly dependent since

\[ 3(1, 4) + 1(3, -2) - 2(3, 5) = (0, 0). \]

Vectors are linearly independent provided they are not linearly
dependent. The vectors $u^{(1)}, u^{(2)}, \ldots, u^{(m)}$ are a basis of $V$ provided
they are linearly independent and span $V$. By an ordered basis
we mean a basis in which the vectors of the basis are listed in
a specified order; to indicate that we have an ordered basis we
write $(u^{(1)}, u^{(2)}, \ldots, u^{(m)})$. A spanning set $S$ of $V$ is a minimal
spanning set of $V$ provided that each set of vectors obtained from
$S$ by removing a vector is not a spanning set for $V$. A linearly
independent set $S$ of vectors of $V$ is a maximal linearly independent
set of vectors of $V$ provided that for each vector $w$ of $V$ that is not
in $S$, $S \cup \{w\}$ is linearly dependent (when this happens, $w$ must
be a linear combination of the vectors in $S$).
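Linear dependence can be tested mechanically. The sketch below, which works over the rational numbers using exact `Fraction` arithmetic, evaluates a linear combination and decides independence by row reduction; the function names and the choice of Gaussian elimination are assumptions made for illustration, not a method prescribed by the text.

```python
from fractions import Fraction


def linear_combination(scalars, vectors):
    """Return c_1 u^(1) + ... + c_m u^(m) as a tuple."""
    n = len(vectors[0])
    return tuple(sum(c * v[i] for c, v in zip(scalars, vectors)) for i in range(n))


def rank(vectors):
    """Number of linearly independent vectors, found by row reduction."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    r = 0  # number of pivot rows found so far
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r


def is_linearly_independent(vectors):
    """Vectors are independent exactly when the rank equals their number."""
    return rank(vectors) == len(vectors)


if __name__ == "__main__":
    vs = [(1, 4), (3, -2), (3, 5)]
    print(linear_combination([3, 1, -2], vs))
    print(is_linearly_independent(vs), is_linearly_independent(vs[:2]))
```

Any two of the three sample vectors form a basis of $\mathbb{R}^2$, so the three together are a spanning set that is not minimal.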


In the next theorem, we collect some basic facts about these 
properties. 


Theorem 1.5.2 Let $V$ be a subspace of $F^n$.

(i) Then $V$ has a basis and any two bases of $V$ contain the same
number of vectors.

(ii) A minimal spanning set of $V$ is a basis of $V$. Thus every
spanning set of vectors contains a basis of $V$.

(iii) A maximal linearly independent set of vectors of $V$ is a basis
of $V$. Thus every linearly independent set of vectors can be
extended to a basis of $V$.

(iv) If $(u^{(1)}, u^{(2)}, \ldots, u^{(m)})$ is an ordered basis of $V$, then each
vector $u$ in $V$ can be written uniquely as a linear combination
of these vectors: $u = c_1 u^{(1)} + c_2 u^{(2)} + \cdots + c_m u^{(m)}$, where the
scalars $c_1, c_2, \ldots, c_m$ are uniquely determined.


The number of vectors in a basis of a subspace $V$ and so, by
(i) of Theorem 1.5.2, the number of vectors in every basis of $V$, is
the dimension of $V$, denoted by $\dim V$. In (iv) of Theorem 1.5.2,
the scalars $c_1, c_2, \ldots, c_m$ are the coordinates of $u$ with respect to
the ordered basis $(u^{(1)}, u^{(2)}, \ldots, u^{(m)})$.


Definition 1.5.3 Let $U$ be a subspace of $F^n$ and let $V$ be a
subspace of $F^n$. A mapping $T : U \to V$ is a linear transformation
provided

\[ T(cu + dw) = cT(u) + dT(w) \]

for all vectors $u$ and $w$ in $U$ and all scalars $c$ and $d$. The kernel of
the linear transformation $T$ is the set

\[ \ker(T) = \{ u \in U : T(u) = 0 \} \]

of all vectors of $U$ that are mapped to the zero vector of $V$. The
linear transformation $T$ is an injective linear transformation if and
only if $\ker(T) = \{0\}$. The range of $T$ is the set

\[ \mathrm{range}(T) = \{ T(u) : u \in U \} \]

of all values (images) of vectors in $U$. It follows by induction from
the definition of a linear transformation that linear transformations
preserve all linear combinations, that is,

\[ T(c_1 u^{(1)} + c_2 u^{(2)} + \cdots + c_p u^{(p)}) = c_1 T(u^{(1)}) + c_2 T(u^{(2)}) + \cdots + c_p T(u^{(p)}) \]

for all vectors $u^{(1)}, u^{(2)}, \ldots, u^{(p)}$ and all scalars $c_1, c_2, \ldots, c_p$.


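Finally, we review the notion of the dot product of vectors in
$\mathbb{R}^n$ and $\mathbb{C}^n$.


Definition 1.5.4 Let $u = (a_1, a_2, \ldots, a_n)$ and $v = (b_1, b_2, \ldots, b_n)$
be vectors in either $\mathbb{R}^n$ or $\mathbb{C}^n$. Then their dot product $u \cdot v$ is defined
by

(i) $u \cdot v = a_1 b_1 + a_2 b_2 + \cdots + a_n b_n$, $\quad u, v \in \mathbb{R}^n$;

(ii) $u \cdot v = a_1 \overline{b_1} + a_2 \overline{b_2} + \cdots + a_n \overline{b_n}$, $\quad u, v \in \mathbb{C}^n$.

Here $\overline{b}$ denotes the complex conjugate³ of $b$. In particular, we have
that

\[ u \cdot u = a_1 \overline{a_1} + a_2 \overline{a_2} + \cdots + a_n \overline{a_n} = |a_1|^2 + |a_2|^2 + \cdots + |a_n|^2 \ge 0 \]

with equality if and only if $u$ is a zero vector. The norm (or length)
$\|u\|$ of a vector $u$ is defined by

\[ \|u\| = \sqrt{u \cdot u}. \]

³Recall that $\overline{a + b} = \overline{a} + \overline{b}$ and $\overline{ab} = \overline{a}\,\overline{b}$.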
The next theorem contains some elementary properties of dot
products and norms.


Theorem 1.5.5 Let $u$, $v$, and $w$ be vectors in $\mathbb{R}^n$ or $\mathbb{C}^n$. Then
the following hold:

(i) $\|u\| \ge 0$ with equality if and only if $u = 0$.

(ii) $(u + v) \cdot w = u \cdot w + v \cdot w$ and $u \cdot (v + w) = u \cdot v + u \cdot w$.

(iii) $cu \cdot v = c(u \cdot v)$ and $u \cdot cv = \overline{c}(u \cdot v)$ (so if $c$ is a real scalar,
$u \cdot cv = c(u \cdot v)$).

(iv) $u \cdot v = \overline{v \cdot u}$ (so if $u$ and $v$ are real vectors, $u \cdot v = v \cdot u$).

(v) (Cauchy-Schwarz inequality) $|u \cdot v| \le \|u\| \|v\|$ with equality
if and only if $u$ and $v$ are linearly dependent.


Let $u$ and $v$ be nonzero vectors in $\mathbb{R}^n$. By the Cauchy-Schwarz
inequality,

\[ -1 \le \frac{u \cdot v}{\|u\| \|v\|} \le 1. \]

Hence there is an angle $\theta$ with $0 \le \theta \le \pi$ such that

\[ \cos\theta = \frac{u \cdot v}{\|u\| \|v\|}. \]

The angle $\theta$ is the angle between the vectors $u$ and $v$. It follows
that $u \cdot v = 0$ if and only if $\theta = \pi/2$, in which case $u$ and $v$ are
orthogonal. The zero vector is orthogonal to every vector. For
vectors $u$ and $v$ in $\mathbb{C}^n$, we also say that $u$ and $v$ are orthogonal
if $u \cdot v = 0$. Mutually orthogonal, nonzero vectors are linearly
independent. In particular, $n$ mutually orthogonal, nonzero vectors
$u_1, u_2, \ldots, u_n$ of $\mathbb{R}^n$ or $\mathbb{C}^n$ form a basis. If, in addition, each
of the vectors $u_1, u_2, \ldots, u_n$ has unit length (which can always be
achieved by multiplying each $u_i$ by $1/\|u_i\|$), then $u_1, u_2, \ldots, u_n$ is
an orthonormal basis.
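The dot product, norm, and angle can be computed directly from their definitions. The following sketch assumes vectors are given as tuples of Python numbers; the names `dot`, `norm`, and `angle` are illustrative. Conjugating the second argument, as in Definition 1.5.4(ii), covers the real case as well because a real number is its own conjugate.

```python
import math


def dot(u, v):
    """Dot product of Definition 1.5.4; the second argument is conjugated."""
    return sum(a * complex(b).conjugate() for a, b in zip(u, v))


def norm(u):
    """Norm ||u|| = sqrt(u . u); u . u is real and nonnegative."""
    return math.sqrt(dot(u, u).real)


def angle(u, v):
    """Angle in [0, pi] between two nonzero real vectors."""
    cosine = dot(u, v).real / (norm(u) * norm(v))
    return math.acos(max(-1.0, min(1.0, cosine)))  # clamp against rounding error


if __name__ == "__main__":
    u, v = (1 + 2j, 3), (2, 1j)
    print(dot(u, v), dot(v, u))                # complex conjugates of each other
    print(abs(dot(u, v)) <= norm(u) * norm(v))  # Cauchy-Schwarz inequality
    print(angle((1, 0), (0, 1)))                # orthogonal real vectors
```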

Now let $v_1, v_2, \ldots, v_m$ be an arbitrary basis of a subspace $V$
of $\mathbb{R}^n$ or of $\mathbb{C}^n$. The Gram-Schmidt orthogonalization algorithm
determines an orthonormal basis $u_1, u_2, \ldots, u_m$ with the property
that the subspace spanned by $v_1, v_2, \ldots, v_k$ equals the subspace
spanned by $u_1, u_2, \ldots, u_k$ for each $k = 1, 2, \ldots, m$. After first
normalizing $v_1$ to obtain a vector $u_1 = v_1/\|v_1\|$ of unit length,
the algorithm proceeds recursively by orthogonally projecting $v_{i+1}$
onto the subspace $V_i$ spanned by $u_1, \ldots, u_i$ (equivalently, the
subspace spanned by $v_1, \ldots, v_i$), forming the difference vector that is
orthogonal to this subspace $V_i$, and then normalizing this vector
to have length 1. Algebraically, we have

\[ u_{i+1} = \frac{v_{i+1} - \mathrm{proj}_{V_i}(v_{i+1})}{\| v_{i+1} - \mathrm{proj}_{V_i}(v_{i+1}) \|}
= \frac{v_{i+1} - \sum_{j=1}^{i} (v_{i+1} \cdot u_j)\, u_j}{\| v_{i+1} - \sum_{j=1}^{i} (v_{i+1} \cdot u_j)\, u_j \|} \]

for $i = 1, 2, \ldots, m - 1$.
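The recursion above translates almost line by line into code. The sketch below is restricted to real vectors and uses floating-point arithmetic, so results are only accurate up to rounding; the function name `gram_schmidt` and the tolerance `1e-12` are assumptions made for illustration.

```python
import math


def dot(u, v):
    """Real dot product."""
    return sum(a * b for a, b in zip(u, v))


def gram_schmidt(basis):
    """Return an orthonormal list u_1, ..., u_m spanning the same subspaces."""
    ortho = []
    for v in basis:
        # Subtract the projection of v onto the span of the u_j found so far.
        w = list(v)
        for u in ortho:
            coeff = dot(v, u)
            w = [wi - coeff * ui for wi, ui in zip(w, u)]
        length = math.sqrt(dot(w, w))
        if length < 1e-12:
            raise ValueError("the given vectors are linearly dependent")
        ortho.append([wi / length for wi in w])  # normalize to length 1
    return ortho


if __name__ == "__main__":
    for u in gram_schmidt([(3, 4), (1, 0)]):
        print(u)
```

Processing the vectors in the given order is what guarantees that the first $k$ output vectors span the same subspace as the first $k$ input vectors.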

For each $\theta$ with $0 \le \theta \le \pi$, the vector $(\cos\theta, \sin\theta)^T$ and the
vector $(-\sin\theta, \cos\theta)^T$ form an orthonormal basis of $\mathbb{R}^2$; this is
the orthonormal basis obtained by rotating the standard basis
$(1, 0)$, $(0, 1)$ by an angle $\theta$ in the counterclockwise direction.


In this first chapter, we have given a very brief introduction to 
elementary graph theory, combinatorics, and linear algebra. For 
more about these areas of mathematics, and indeed for many of 
the topics discussed in the next chapters, one may consult the 
extensive material given in the handbooks Handbook of Discrete 
and Combinatorial Mathematics [68], Handbook of Graph Theory
[39], and Handbook of Linear Algebra [46]. 


1.6 Exercises 


1. Prove Theorem 1.1.4. 
2. List the structurally different trees of order 6. 


3. Prove that there does not exist a regular graph of degree k 
with n vertices if both n and k are odd. 


4. Determine the chromatic numbers of the following graphs:
(a) the graph obtained from $K_n$ by removing an edge; (b) the
graph obtained from $K_n$ by removing two edges (there are
two possibilities: the removed edges may or may not have a
vertex in common); (c) the graph obtained from a tree by
adding a new edge (the new edge may create either a cycle
of even length or a cycle of odd length).


5. Let $G$ be the bipartite graph with bipartition

$U = \{u_1, u_2, u_3, u_4, u_5, u_6\}$ and $W = \{w_1, w_2, w_3, w_4, w_5, w_6\}$

whose edges are all those pairs $\{u_i, w_j\}$ for which $2i + 3j$ is
congruent to 0, 1, or 5 modulo 6. Draw the graph $G$ and
determine a matching with the largest number of edges and
a vertex-cover with the smallest number of vertices.


6. Let the digraph $G$ be obtained from the complete graph $K_n$


by giving a direction to each edge. (Such a digraph is usually 
called a tournament.) Let dj,d3,...,d;5 be the outdegrees 
of G in some order. Prove that 


dt +d} +- tdi < () NH DV Vene) 


with equality for k = n. 


7. Let $D$ be the digraph with vertex set $\{1, 2, 3, 4, 5, 6, 7, 8\}$,
where there is an edge from $i$ to $j$ if and only if $2i + 3j$ is
congruent to 1 or 4 modulo 8. Determine whether or not $D$
is strongly connected.


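8. Use the inclusion-exclusion formula to show that the number
of permutations $i_1 i_2 \ldots i_n$ of $\{1, 2, \ldots, n\}$ such that $i_k \neq k$
for $k = 1, 2, \ldots, n$ is given by

\[ n! \left( 1 - \frac{1}{1!} + \frac{1}{2!} - \frac{1}{3!} + \cdots + (-1)^n \frac{1}{n!} \right). \]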


9. Prove that the number of even (respectively, odd) combinations
of $\{1, 2, \ldots, n\}$ equals $2^{n-1}$. (By an even combination
we understand a combination with an even number of elements;
an odd combination has an odd number of elements.)


10. Determine the number of solutions in nonnegative integers
of

\[ x_1 + x_2 + x_3 + x_4 + x_5 = 24, \]


where x; > 2 and z; > 3. 


11. Write out the complete addition and multiplication tables
for the field $\mathbb{Z}_5$.


12. Prove Theorem 1.4.2.


13. Show that 101°°° ≡ 1 (mod 100) and that 99°' ≡
−1 (mod 100).


14. Let $V$ be the set of all vectors $(a_1, a_2, \ldots, a_n)$ in $F^n$ such
that $a_1 + a_2 + \cdots + a_n = 0$. Prove that $V$ is a subspace of
$F^n$ and find a basis of $V$.


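15. Let $u^{(1)}, u^{(2)}, \ldots, u^{(n)}$ be an orthonormal basis of $\mathbb{R}^n$. Prove
that if $u$ is a vector in $\mathbb{R}^n$, then

\[ u = \sum_{i=1}^{n} (u \cdot u^{(i)})\, u^{(i)}. \]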


16. Prove Theorem 1.5.5.


17. Show that $(1, 0, 0)$, $(1, 1, 0)$, $(1, 1, 1)$ is a basis of $\mathbb{R}^3$ and use
the Gram-Schmidt orthogonalization algorithm to obtain an
orthonormal basis.


Chapter 2 


Basic Matrix Operations 


In this chapter we introduce matrices as arrays of numbers and 
define their basic algebraic operations: sum, product, and trans- 
position. Next, we associate to a matrix a digraph called the König 
digraph and establish connections of matrix operations with cer- 
tain operations on graphs. These graph-theoretic operations il- 
luminate the matrix operations and aid in understanding their 
properties. In particular, we use the König digraph to explain 
how matrices can be partitioned into blocks in order to facilitate 
matrix operations. 


2.1 Basic Concepts 


Let $m$ and $n$ be positive integers.¹ A matrix is an $m$ by $n$
rectangular array of numbers² of the form

\[ A = \begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix}. \tag{2.1} \]


¹There will be occasions later when we will want to allow $m$ and $n$ to be
0, resulting in empty matrices in the definition.

²These may be real numbers, complex numbers, or numbers from some
other arithmetic system, such as the integers modulo $n$.
The matrix $A$ has size $m$ by $n$ and we often say that its type is
$m \times n$. The $mn$ numbers $a_{ij}$ are called the entries (or elements) of
the matrix $A$. If $m = n$, then $A$ is a square matrix, and instead of
saying $A$ has size $n$ by $n$ we usually say that $A$ is a square matrix
of order $n$.

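The matrix $A$ in (2.1) has $m$ rows of the form

\[ \alpha_i = \begin{bmatrix} a_{i1} & a_{i2} & \cdots & a_{in} \end{bmatrix} \quad (i = 1, 2, \ldots, m) \]

and $n$ columns

\[ \beta_j = \begin{bmatrix} a_{1j} \\ a_{2j} \\ \vdots \\ a_{mj} \end{bmatrix} \quad (j = 1, 2, \ldots, n). \]

The entry $a_{ij}$ contained in both $\alpha_i$ and $\beta_j$, that is, the entry at the
intersection of row $i$ and column $j$, is the $(i, j)$-entry of $A$. The
rows $\alpha_i$ are 1 by $n$ matrices, or row vectors; the columns $\beta_j$ are $m$
by 1 matrices, or column vectors. For brevity we denote the $m$ by
$n$ matrix $A$ by

\[ A = [a_{ij}]_{m,n} \]

and usually more simply as $[a_{ij}]$ if the size is understood.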

Two matrices $A = [a_{ij}]$ and $B = [b_{ij}]$ are equal matrices provided
that they have the same size $m$ by $n$ and corresponding
entries are equal:

\[ a_{ij} = b_{ij} \quad (i = 1, 2, \ldots, m;\ j = 1, 2, \ldots, n). \]

Thus, for instance, a 2 by 3 matrix can never equal a 3 by 2 matrix,


and 
205 2 15 
e 
because the (1, 2)-entries of these matrices are not equal. 
Addition and subtraction of matrices are defined in a natu- 
ral way by the addition and subtraction of corresponding entries. 


More precisely, if $A = [a_{ij}]$ and $B = [b_{ij}]$ are matrices of the same
size $m$ by $n$, then their matrix sum is the $m$ by $n$ matrix

\[ A + B = [a_{ij} + b_{ij}] \]

and their matrix difference is the $m$ by $n$ matrix

\[ A - B = [a_{ij} - b_{ij}]. \]

In both cases we perform the operation entrywise.


Example 2.1.1 


1 —3 BAER RE 
4 2-3 — 5 6) 1373 


Two matrices of different sizes can never be added or subtracted. 


The multiplication of matrices is more complicated and, as a 
result, more interesting and, as we shall see, important and useful. 
First of all, the multiplication $A \cdot B$, or simply $AB$, of two matrices
$A$ and $B$ is possible if and only if the number of columns of $A$ equals
the number of rows of $B$. So let $A = [a_{ij}]_{m,n}$ and $B = [b_{ij}]_{n,p}$. Then
the matrix product $A \cdot B$ (more simply, $AB$) is the $m$ by $p$ matrix
$C = [c_{ij}]$, where

\[ c_{ij} = a_{i1} b_{1j} + a_{i2} b_{2j} + \cdots + a_{in} b_{nj} \quad (i = 1, 2, \ldots, m;\ j = 1, 2, \ldots, p). \]

Thus the $(i, j)$-entry of $AB$ is determined only by the $i$th row
of $A$ and the $j$th column of $B$.
Example 2.1.2 We have the matrix product 
Fe los 5 1 -7 11 -16 
4 2 —3 4 0 1 6 -8 14 13 -12 


Here, for instance, $13 = 4 \cdot 5 + 2 \cdot (-2) + (-3) \cdot 1 = 20 - 4 - 3$.


There are some important observations to be made here. First, 
even though the product AB is defined (because the number of 
columns of A equals the number of rows of B), the product BA 
may not be defined (because the number of columns of B may not 
equal the number of rows of A). In fact, if A is m by n, then both 
AB and BA are defined if and only if B is n by m. In particular, 
if A and B are square matrices of the same order n, then both AB 
and BA are defined. But they need not be equal matrices. 
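Example 2.1.3 We have

\[ \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}
= \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}, \]

while

\[ \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}
\begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}
= \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}. \]

Notice also that we have, in this case, an instance of matrix
multiplication where $BA = A$.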
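The definition of the matrix product, and the fact that $AB$ and $BA$ can differ, can be checked with a short sketch. Matrices are assumed here to be stored as lists of rows, and the function name `mat_mul` is illustrative; the two 2 by 2 matrices are chosen so that $AB$ is a zero matrix while $BA = A$.

```python
def mat_mul(A, B):
    """Product of an m-by-n matrix A and an n-by-p matrix B (lists of rows)."""
    n = len(B)
    if any(len(row) != n for row in A):
        raise ValueError("number of columns of A must equal number of rows of B")
    p = len(B[0])
    # c_ij = a_i1 b_1j + a_i2 b_2j + ... + a_in b_nj
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(p)]
            for i in range(len(A))]


if __name__ == "__main__":
    A = [[0, 1], [0, 0]]
    B = [[1, 0], [0, 0]]
    print(mat_mul(A, B))
    print(mat_mul(B, A))
```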


In addition to matrix addition, subtraction, and multiplication, 
there is one additional operation that we define now. It’s perhaps 
the simplest of them all. Let $A = [a_{ij}]$ be an $m$ by $n$ matrix and
let $c$ be a number. Then the matrix $c \cdot A$, or simply $cA$, is the $m$
by $n$ matrix obtained by multiplying each entry of $A$ by $c$:

\[ cA = [ca_{ij}]. \]


The matrix cA is called a scalar multiple of A. 
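Matrix transposition is an operation defined on one matrix by
interchanging rows with columns in the following way. Let

\[ A = \begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix} \]

be an $m$ by $n$ matrix. Then the transpose of the matrix $A$ is the
$n$ by $m$ matrix

\[ A^T = \begin{bmatrix}
a_{11} & a_{21} & \cdots & a_{m1} \\
a_{12} & a_{22} & \cdots & a_{m2} \\
\vdots & \vdots & \ddots & \vdots \\
a_{1n} & a_{2n} & \cdots & a_{mn}
\end{bmatrix}. \]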


Matrix transposition satisfies the following properties (where
the matrices are assumed to be of the appropriate sizes so that
the operations can be carried out), and these properties can be
verified in a straightforward manner:

$(A^T)^T = A$ (transposition is an involutory operation),
$(A + B)^T = A^T + B^T$ (transposition commutes with addition),
$(cA)^T = cA^T$ (transposition commutes with scalar multiplication).

Elementary, but not as straightforward, is the relation

$(AB)^T = B^T A^T$ (transposition "anticommutes" with multiplication).

This relationship can be verified by observing that the entry in
position $(i, j)$ of $(AB)^T$ (so the $(j, i)$-entry of $AB$) is obtained
from the $j$th row of $A$ and the $i$th column of $B$ as prescribed by
matrix multiplication, while the entry in position $(i, j)$ of $B^T A^T$
is obtained from the $i$th row of $B^T$ (so the $i$th column of $B$) and
the $j$th column of $A^T$ (so the $j$th row of $A$), again as prescribed
by matrix multiplication. In the next section, we give a
graph-theoretic viewpoint of the relation $(AB)^T = B^T A^T$.
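The relation $(AB)^T = B^T A^T$ can be spot-checked numerically. The sketch below assumes matrices stored as lists of rows; the helper names and the sample matrices are chosen only for illustration, and a check on one pair of matrices is of course not a proof.

```python
def transpose(A):
    """Interchange rows and columns of a matrix given as a list of rows."""
    return [list(col) for col in zip(*A)]


def mat_mul(A, B):
    """Matrix product; assumes the number of columns of A equals the rows of B."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


if __name__ == "__main__":
    A = [[1, 2, 3], [4, 5, 6]]     # 2 by 3
    B = [[1, 0], [2, 1], [0, 3]]   # 3 by 2
    left = transpose(mat_mul(A, B))
    right = mat_mul(transpose(B), transpose(A))
    print(left == right)
```

Note that the product $A^T B^T$ in the other order is a 3 by 3 matrix here, so it cannot equal the 2 by 2 matrix $(AB)^T$.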

To conclude this section we define some special matrices that 
are very useful. A zero matrix is a matrix each of whose entries 
equals 0. A zero matrix of size $m$ by $n$ is denoted by $O_{m,n}$. We
often simply write $O$ with the size of the matrix being understood
from the context. An identity matrix (or unit matrix) is a square
matrix $A = [a_{ij}]$ such that $a_{ii} = 1$ for all $i$ and $a_{ij} = 0$ if $i \neq j$.
An identity matrix of order $n$ is denoted by $I_n$, and we often write
$I$ with the order being understood from the context. Thus the
identity matrix of order $n$ is the matrix $I_n = [\delta_{ij}]$, where $\delta$ is the
so-called Kronecker $\delta$-symbol defined by

\[ \delta_{ij} = \begin{cases} 1, & \text{if } i = j, \\ 0, & \text{if } i \neq j. \end{cases} \]


Example 2.1.4 The identity matrix of order 3 is 


\[ I_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}. \]
For matrix addition, zero matrices act like the number 0 acts
for ordinary addition. For matrix multiplication, identity matrices
act like the number 1 acts for ordinary multiplication. These
properties are expressed more precisely in the following equations,
which are easily verified:

\[ O + A = A + O = A \tag{2.2} \]

if $A$ and $O$ have the same size, and

\[ I_m A = A \quad\text{and}\quad B I_n = B, \tag{2.3} \]

if $A$ has $m$ rows and $B$ has $n$ columns.

The main diagonal, or simply diagonal, of a square matrix
$A = [a_{ij}]$ of order $n$ consists of the $n$ entries $a_{11}, a_{22}, \ldots, a_{nn}$. We also
refer to the $n$ positions of these $n$ entries of $A$ as the main diagonal
of $A$, and we refer to the remaining positions of $A$ as the
off-diagonal of $A$. A square matrix is a diagonal matrix provided
each off-diagonal entry of $A$ equals 0.


Example 2.1.5 The matrix 


3 0 0) 
ve 
me 


is a diagonal matrix. Identity matrices as well as square zero 
matrices are diagonal matrices. 


A diagonal matrix with diagonal entries $d_1, d_2, \ldots, d_n$ is
sometimes denoted by

\[ D = \mathrm{diag}(d_1, d_2, \ldots, d_n). \]

If $d_1 = d_2 = \cdots = d_n$, with the common value equal to $d$, then
$D = dI_n$ and $D$ is called a scalar matrix. A square matrix
is an upper triangular matrix provided all its entries below the
main diagonal equal zero (thus the nonzero entries are confined
to those positions on and above the main diagonal). A lower
triangular matrix is defined in an analogous way. Note that if $A$
is a square matrix, then $A$ is upper triangular if and only if $A^T$ is
lower triangular.

A permutation matrix $P = [p_{ij}]$ of order $m$ is a square matrix
that has exactly one 1 in each row and column and 0's elsewhere.
Thus a permutation matrix of order $m$ has exactly $m$
nonzero entries and each of these $m$ entries equals 1. Permutation
matrices correspond to permutations in the following way: Let
$\sigma = k_1 k_2 \ldots k_m$ be a permutation of $\{1, 2, \ldots, m\}$. Let $P = [p_{ij}]$
be the square matrix of order $m$ defined by

\[ p_{ij} = \begin{cases} 1, & \text{if } j = k_i, \\ 0, & \text{otherwise.} \end{cases} \]

Then $P$ is a permutation matrix and every permutation matrix of
order $m$ corresponds to a permutation of $\{1, 2, \ldots, m\}$ in this way.
If $A$ is an $m$ by $n$ matrix, then $PA$ is obtained by permuting the
rows of $A$ so that in $PA$ the rows of $A$ are in the order: row $k_1$,
row $k_2$, ..., row $k_m$.
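The correspondence between permutations and permutation matrices can be sketched as follows. The permutation is assumed to be given as a Python list `[k_1, ..., k_m]` of the integers 1 through $m$; the function names are illustrative.

```python
def permutation_matrix(sigma):
    """Matrix P with p_ij = 1 if j = k_i and 0 otherwise, for sigma = k_1 ... k_m."""
    m = len(sigma)
    if sorted(sigma) != list(range(1, m + 1)):
        raise ValueError("sigma must be a permutation of 1, ..., m")
    return [[1 if j == sigma[i - 1] else 0 for j in range(1, m + 1)]
            for i in range(1, m + 1)]


def mat_mul(A, B):
    """Matrix product; assumes the number of columns of A equals the rows of B."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


if __name__ == "__main__":
    P = permutation_matrix([3, 1, 2, 4])
    A = [[10, 11], [20, 21], [30, 31], [40, 41]]  # row i starts with 10*i
    for row in mat_mul(P, A):
        print(row)  # rows of A appear in the order 3, 1, 2, 4
```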


Example 2.1.6 If $\sigma = 3124$, then

\[ P = \begin{bmatrix}
0 & 0 & 1 & 0 \\
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}. \]


The definition of a permutation matrix treats rows and
columns in the same way. Thus the transpose of a permutation
matrix is a permutation matrix, and we have from the properties
of transposition that

\[ (PA)^T = A^T P^T. \]

It thus follows that to permute the columns of an $m$ by $n$ matrix $A$
so that they occur in the order $l_1, l_2, \ldots, l_n$, we multiply $A$ on the
right by the permutation matrix $Q^T$, where $Q$ is the permutation
matrix corresponding to the permutation $l_1 l_2 \ldots l_n$ of $\{1, 2, \ldots, n\}$.
In particular, if $A$ is a square matrix of order $n$, then the matrix
$QAQ^T$ is obtained from $A$ by permuting the rows to put them in
the order $l_1, l_2, \ldots, l_n$ and permuting the columns to put them in
the order $l_1, l_2, \ldots, l_n$. The matrix $QAQ^T$ is obtained from $A$ by
simultaneous permutations of its rows and columns.


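Example 2.1.7 Let $A = [a_{ij}]$ be a general matrix of order 4. Let
$P$ be the permutation matrix corresponding to the permutation
2431 of $\{1, 2, 3, 4\}$. Then

\[ PAP^T = \begin{bmatrix}
0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 \\
0 & 0 & 1 & 0 \\
1 & 0 & 0 & 0
\end{bmatrix}
\begin{bmatrix}
a_{11} & a_{12} & a_{13} & a_{14} \\
a_{21} & a_{22} & a_{23} & a_{24} \\
a_{31} & a_{32} & a_{33} & a_{34} \\
a_{41} & a_{42} & a_{43} & a_{44}
\end{bmatrix}
\begin{bmatrix}
0 & 0 & 0 & 1 \\
1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 1 & 0 & 0
\end{bmatrix}
= \begin{bmatrix}
a_{22} & a_{24} & a_{23} & a_{21} \\
a_{42} & a_{44} & a_{43} & a_{41} \\
a_{32} & a_{34} & a_{33} & a_{31} \\
a_{12} & a_{14} & a_{13} & a_{11}
\end{bmatrix}. \]

Note that the main diagonal entries of $A$ occur in the order 2, 4, 3, 1
on the main diagonal of $PAP^T$.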
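A simultaneous permutation of rows and columns can be carried out without forming any matrix products: by the description of $PA$ and $AQ^T$ above, the $(i, j)$-entry of $PAP^T$ is the entry of $A$ in row $k_i$ and column $k_j$. The sketch below uses that index formula; the function name and the symbolic string entries are assumptions made for illustration.

```python
def simultaneous_permutation(A, sigma):
    """Return P A P^T for the permutation sigma = [k_1, ..., k_n] (1-based)."""
    idx = [k - 1 for k in sigma]  # convert to 0-based positions
    # Entry (i, j) of the result is the entry of A in row k_i and column k_j.
    return [[A[i][j] for j in idx] for i in idx]


if __name__ == "__main__":
    # A general matrix of order 4 with symbolic entries "a11", ..., "a44".
    A = [[f"a{i}{j}" for j in range(1, 5)] for i in range(1, 5)]
    B = simultaneous_permutation(A, [2, 4, 3, 1])
    for row in B:
        print(row)
    print([B[i][i] for i in range(4)])  # diagonal entries of A, reordered
```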


Finally, let $A$ and $B$ be matrices of sizes $m$ by $n$ and $p$ by $q$,
respectively. Then the direct sum of $A$ with $B$ is the $m + p$ by
$n + q$ matrix given by

\[ A \oplus B = \begin{bmatrix} A & O_{m,q} \\ O_{p,n} & B \end{bmatrix}. \]

In case A and B are square matrices, so is their direct sum. The 
direct sum of more than two matrices is defined in the obvious 
way. 
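As a small illustration of the block structure of a direct sum, the following sketch builds $A \oplus B$ from two matrices stored as lists of rows. The function name `direct_sum` is illustrative.

```python
def direct_sum(A, B):
    """Direct sum of an m-by-n matrix A and a p-by-q matrix B (lists of rows)."""
    n = len(A[0])
    q = len(B[0])
    top = [list(row) + [0] * q for row in A]     # [ A  O_{m,q} ]
    bottom = [[0] * n + list(row) for row in B]  # [ O_{p,n}  B ]
    return top + bottom


if __name__ == "__main__":
    A = [[1, 2]]    # 1 by 2
    B = [[3], [4]]  # 2 by 1
    for row in direct_sum(A, B):
        print(row)  # a 3 by 3 matrix
```

The direct sum of more than two matrices can be formed by applying `direct_sum` repeatedly, since the operation is associative.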

In the next section, we introduce the König digraph of a matrix 
that illuminates much of our discussion in this section. 
2.2 The König Digraph of a Matrix


Let $A = [a_{ij}]$ be an $m$ by $n$ matrix. Corresponding to $A$ we
introduce a digraph $G(A)$, defined in the following way. The
digraph $G(A)$ has $m + n$ vertices and these are colored either black
or white. There are $m$ black vertices, in one-to-one correspondence
with the rows of $A$, and they are denoted by the numbers
$1, 2, \ldots, m$. There are $n$ white vertices, in one-to-one correspondence
with the columns of $A$, and they are denoted by $1, 2, \ldots, n$.
There is an edge from each black vertex to each of the white
vertices. Drawing the black vertices in a column and the white
vertices in another column to the right, all edges are directed from
left to right. To the edge going out from the black vertex $i$ and
terminating at the white vertex $j$ we let correspond the matrix
entry $a_{ij}$, where $a_{ij}$ is called the weight of the edge. The digraph
$G(A)$ is called the König digraph of the matrix $A$. The edges of
the König digraph are in one-to-one correspondence with the
positions of the matrix, with each edge weighted (or labeled) by the
entry of $A$ in the corresponding position. In summary, the vertices
of a König digraph are of either color black or white, and the sets
of black vertices and of white vertices have labels that are
consecutive ordinal numbers beginning with 1; the edges can have any
numbers as labels. Any digraph with these properties is the König
digraph of a matrix. In fact, as should be clear, the König digraph
is just an alternative structure to a rectangular array for viewing
a matrix. The type of a König digraph is $m$ by $n$ (or $m \times n$), if
there are $m$ black vertices and $n$ white vertices.


Example 2.2.1 The König digraph of the matrix

\[ A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix} \]

is displayed in Figure 2.1.
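One possible data structure for a König digraph is a dictionary that maps each edge, written as a pair (black vertex $i$, white vertex $j$), to its weight. This representation is an assumption made for illustration and is not taken from the text; the sketch uses the 2 by 3 matrix with rows $(1, 2, 3)$ and $(4, 5, 6)$.

```python
def konig_digraph(A):
    """Edges (i, j) -> weight a_ij, with black vertex i and white vertex j (1-based)."""
    return {(i + 1, j + 1): A[i][j]
            for i in range(len(A)) for j in range(len(A[0]))}


def matrix_of(G, m, n):
    """Recover the m-by-n matrix from the edge weights of a Konig digraph."""
    return [[G[(i, j)] for j in range(1, n + 1)] for i in range(1, m + 1)]


if __name__ == "__main__":
    A = [[1, 2, 3], [4, 5, 6]]
    G = konig_digraph(A)
    print(len(G))                  # one edge per position of the matrix
    print(G[(2, 3)])               # weight of the edge from black 2 to white 3
    print(matrix_of(G, 2, 3) == A)
```

The round trip between `konig_digraph` and `matrix_of` reflects the remark that the digraph is an alternative structure for viewing the same matrix.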


The digraph $G(A)$ of the matrix $A$ is called the König digraph,
because König used the corresponding bipartite graph in his
papers (see [56]). In fact, D. König, the Hungarian mathematician
who is considered to be the founder of modern graph theory, was
the first to use graphical methods in matrix theory [55], [56]. There
are many mathematical papers in which results from matrix
theory are obtained or proved by graph-theoretical means (see the
Coda).


Figure 2.1 


The matrix operations defined in the last section have coun- 
terparts for the König digraph, and we define these now. 


Definition 2.2.2 Let $G_1$ and $G_2$ be two König digraphs.


1. Digraph Sum: Assume that $G_1$ and $G_2$ are of the same type.
Then their sum $G_1 + G_2$ is the König digraph of that same
type, where the weight of the edge from black vertex $i$ to
white vertex $j$ is the sum of the weights of the corresponding
edges of $G_1$ and $G_2$.


2. Digraph Composition: Let $G_1$ be of type $m$ by $n$, and let $G_2$
be of type $n$ by $p$. Then the number of white vertices of $G_1$
equals the number of black vertices of $G_2$. The composition
$G_1 * G_2$ is the digraph of type $m$ by $p$ obtained by identifying
each white vertex of $G_1$ with the correspondingly labeled
black vertex of $G_2$. The digraph $G_1 * G_2$ has vertices of
three colors: black (the black vertices of $G_1$), white (the
white vertices of $G_2$), and gray (the vertices obtained by
identifying the white vertices of $G_1$ with the black vertices
of $G_2$). Note that $G_1 * G_2$, having vertices of three different
colors, is not a König digraph.


3. Digraph Product: Let $G_1$ be of type $m$ by $n$, let $G_2$ be of
type $n$ by $p$, and consider the digraph composition $G_1 * G_2$.
The product $G_1 \cdot G_2$ is the König digraph of type $m$ by $p$
whose black vertices are the black vertices of $G_1 * G_2$ and
whose white vertices are the white vertices of $G_1 * G_2$. The
weight of the edge from the $i$th black vertex to the $j$th white
vertex of $G_1 \cdot G_2$ equals the sum of the weights of all paths
of length 2 between the $i$th black vertex and the $j$th white
vertex of $G_1 * G_2$. (There are $n$ such paths, and, in general,
the weight of a path is the product of the weights of each of
its edges.)


4. Scalar Multiplication of a Digraph: Let $c$ be a scalar. Then
$c \cdot G_1$ (or, sometimes, more simply, $cG_1$) is the digraph
obtained from $G_1$ by multiplying the weight of each of its edges
by $c$.


Example 2.2.3 Two digraphs, $G_1$ and $G_2$, together with their
composition $G_1 * G_2$ and product $G_1 \cdot G_2$, are displayed in Figure
2.2. In that figure, only the weights of some of the edges are
given; namely, the weights of the edges leaving black vertex 1 in
$G_1$, the weights of the edges terminating in white vertex 1 in $G_2$,
and the weight of the edge from black vertex 1 to white vertex
1 in $G_1 \cdot G_2$.


It should be clear that digraph addition and scalar multiplica- 
tion of a digraph correspond to matrix addition and scalar multi- 
plication of a matrix, respectively. More precisely, if A and B are 
matrices of the same size, then 


G(cA) = cG(A) and G(A + B) = G(A) + G(B). 


An analogous conclusion holds for product, and we state and prove 
this in the next theorem. 
Figure 2.2


Theorem 2.2.4 Let $A = [a_{ij}]$ be a matrix of type $m$ by $n$, and let
$B = [b_{ij}]$ be a matrix of type $n$ by $p$. Then

\[ G(A \cdot B) = G(A) \cdot G(B). \]
Proof. In the composition G(A) * G(B), there is a path of weight
$a_{ik}b_{kj}$ from the ith black vertex to the jth white vertex that
passes through the kth gray vertex for each k = 1,2,...,n. Hence the
sum of the weights of all the paths of length 2 from the ith black
vertex to the jth white vertex is

\[ \sum_{k=1}^{n} a_{ik}b_{kj}, \]

and this equals, according to the definition of a matrix product,
the (i, j)-entry of AB.
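
The statement of Theorem 2.2.4 can be checked numerically. The following is a minimal sketch, not part of the original text: matrices are stored as lists of rows, and the helper names `path_weight_sum`, `product_digraph`, and `matmul`, as well as the sample matrices, are illustrative assumptions.

```python
def path_weight_sum(A, B, i, j):
    """Sum of the weights of the length-2 paths i -> k -> j in G(A) * G(B)."""
    n = len(B)  # number of gray (middle) vertices
    return sum(A[i][k] * B[k][j] for k in range(n))


def product_digraph(A, B):
    """Edge weights of the product digraph G(A) . G(B), as an m-by-p table."""
    m, p = len(A), len(B[0])
    return [[path_weight_sum(A, B, i, j) for j in range(p)] for i in range(m)]


def matmul(A, B):
    """Matrix product computed row by column."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


A = [[1, 2, 3], [4, 5, 6]]       # type 2 by 3 (assumed sample data)
B = [[1, 0], [2, -1], [0, 3]]    # type 3 by 2 (assumed sample data)

print(product_digraph(A, B))
print(product_digraph(A, B) == matmul(A, B))
```

The two tables agree because each entry of the product digraph is, by construction, the same sum that defines the corresponding entry of AB.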


In the next theorem we collect some basic properties expressed 
in terms of graph operations and the corresponding matrix opera- 
tions. In the theorem we assume that the types of the graphs and 
matrices are such that the operations can be carried out. 


Theorem 2.2.5 The following properties hold:


1. The composition of König digraphs is an associative operation:

\[ G_1 * (G_2 * G_3) = (G_1 * G_2) * G_3. \]


2. The product of König digraphs is an associative operation:

\[ G_1 \cdot (G_2 \cdot G_3) = (G_1 \cdot G_2) \cdot G_3. \]

Equivalently, for matrices we have $A_1(A_2A_3) = (A_1A_2)A_3$.


3. Graph multiplication is distributive over addition:

\[ G_1 \cdot (G_2 + G_3) = G_1 \cdot G_2 + G_1 \cdot G_3 \quad\text{and}\quad
   (G_1 + G_2) \cdot G_3 = G_1 \cdot G_3 + G_2 \cdot G_3. \]

Equivalently, for matrices we have $A_1(A_2 + A_3) = A_1A_2 + A_1A_3$
and $(A_1 + A_2)A_3 = A_1A_3 + A_2A_3$.


Proof. These relations are readily verified. The equivalence of the
distributive properties of graph multiplication and matrix
multiplication is a consequence of

\[ G(A) \cdot (G(B) + G(C)) = G(A) \cdot G(B) + G(A) \cdot G(C) \]

and

\[ (G(A) + G(B)) \cdot G(C) = G(A) \cdot G(C) + G(B) \cdot G(C). \]


We now relate the König digraph to transposition. The König
digraph $G(A^T)$ of the matrix $A^T$ is obtained from the König
digraph G(A) of A by changing the color of black vertices to white,
changing the color of white vertices to black, and then changing
the orientation of all edges so that once again edges go from a
black vertex to a white vertex.


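Example 2.2.6 For the matrix

\[ A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}, \]

given along with its König digraph in Example 2.2.1, the transpose
$A^T$ equals

\[ A^T = \begin{bmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{bmatrix}, \]

and its digraph is given in Figure 2.3.


Figure 2.3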
The anticommutativity property $(AB)^T = B^TA^T$ can be understood
in terms of the König digraph. Consider the composition
digraphs G(A) * G(B) and $G(B^T) * G(A^T)$. If in G(A) * G(B)
we make the black vertices white and make the white vertices
black and change the orientation of all edges, then we get the
digraph $G(B^T) * G(A^T)$. This implies that $(AB)^T = B^TA^T$.
Finally, we note that using induction we get the more general
product-transposition rule:

\[ (A_1A_2 \cdots A_k)^T = A_k^T \cdots A_2^T A_1^T \quad (k \ge 2). \]


To conclude this section we establish a convention that at times
is helpful in both presentation and understanding. By definition,
in a König digraph G there is an edge from each black vertex
to each white vertex. However, if the weight of an edge is zero
(corresponding to a zero entry in a matrix), then we can just
delete the edge from G. Thus, with this convention, a König
digraph has black vertices 1,2,...,m and white vertices 1,2,...,n
and there are edges from some of the black vertices to some of
the white vertices. This convention does not influence our matrix
calculations. For example, in the proof of Theorem 2.2.4, the path
of length 2 in the composition G(A) * G(B) from the ith black
vertex to the jth white vertex that passes through the kth gray
vertex disappears if either


(i) in G(A) there is no edge from the ith black vertex to the kth
white vertex (because $a_{ik} = 0$), or


(ii) in G(B) there is no edge from the kth black vertex to the jth
white vertex (because $b_{kj} = 0$).


If (i) or (ii) holds, then the path of length 2 in the composition
G(A) * G(B) has weight $a_{ik}b_{kj} = 0$, and our calculation is not
affected.


This same convention can be applied to any digraph with 
weighted edges. 




Example 2.2.7 Consider the permutation matrix

\[ P = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} \]

corresponding to the permutation σ = 2413 of {1,2,3,4}. If we
apply our convention, then the König digraph G(P) has only four
edges, each of weight equal to 1: an edge from black vertex 1 to
white vertex 2, an edge from black vertex 2 to white vertex 4,
an edge from black vertex 3 to white vertex 1, and an edge from
black vertex 4 to white vertex 3. In general, the König digraph of
a permutation matrix of order n corresponding to the permutation
$\sigma = k_1k_2 \cdots k_n$ has n edges of weight 1, namely, the edges
from black vertex i to white vertex $k_i$ (i = 1,2,...,n). There is
exactly one edge beginning at each black vertex and exactly one edge
terminating at each white vertex; these edges can be regarded as
defining a one-to-one correspondence between the black vertices
and the white vertices.


Using our convention illuminates the proof of the following 
basic fact. 


Theorem 2.2.8 The product of two permutation matrices of the 
same order n is also a permutation matrix of order n. 


Proof. Let P and Q be the permutation matrices corresponding
to the permutations $\sigma = k_1k_2 \cdots k_n$ and $\pi = l_1l_2 \cdots l_n$,
respectively. Then, in G(P) * G(Q), there are exactly n paths of
length 2 from black vertices to white vertices, each of weight equal
to 1 · 1 = 1, and these paths have no vertices in common. In G(PQ),
there is exactly one edge leaving each black vertex and exactly one
edge terminating at each white vertex, and these edges all have
weight equal to 1. More precisely, for each i = 1,2,...,n, there is
an edge of weight 1 from black vertex i to white vertex $l_{k_i}$.
Because $k_1k_2 \cdots k_n$ and $l_1l_2 \cdots l_n$ are both permutations
of {1,2,...,n}, $l_{k_1}l_{k_2} \cdots l_{k_n}$ is also a permutation of
{1,2,...,n}. Thus PQ is a permutation matrix.
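
The argument in this proof can be traced on a small case. The sketch below is illustrative only: the helper names `perm_matrix`, `compose`, and `matmul` and the choice of the second permutation are assumptions, with permutations written as lists of 1-based values.

```python
def perm_matrix(perm):
    """Permutation matrix with a 1 in row i, column perm[i] (1-based values)."""
    n = len(perm)
    return [[1 if perm[i] == j + 1 else 0 for j in range(n)] for i in range(n)]


def matmul(A, B):
    """Matrix product computed row by column."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def compose(sigma, pi):
    """Follow the edge i -> k_i in G(P), then the edge k_i -> l_{k_i} in G(Q)."""
    return [pi[k - 1] for k in sigma]


sigma = [2, 4, 1, 3]   # the permutation 2413 of Example 2.2.7
pi = [3, 4, 1, 2]      # an assumed second permutation, 3412

P, Q = perm_matrix(sigma), perm_matrix(pi)
print(compose(sigma, pi))
print(matmul(P, Q) == perm_matrix(compose(sigma, pi)))
```

Each row of PQ has its single 1 in column $l_{k_i}$, which is exactly the endpoint of the unique path of length 2 that starts at black vertex i.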
2.3 Partitioned Matrices


A matrix A may be partitioned into smaller matrices by inserting 
horizontal and vertical lines that partition its set of rows and its set 
of columns, respectively. Such a matrix is then called a partitioned 
or block matrix, with the resulting smaller matrices called blocks. 


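Example 2.3.1 Let

\[ A = \begin{bmatrix} 1 & 2 & 3 & 4 \\ 5 & 6 & 7 & 8 \\ 8 & 7 & 6 & 5 \\ 4 & 3 & 2 & 1 \end{bmatrix}. \]

We can, for instance, partition A into blocks in the following
way:

\[ A = \left[ \begin{array}{c|cc|c} 1 & 2 & 3 & 4 \\ \hline 5 & 6 & 7 & 8 \\ 8 & 7 & 6 & 5 \\ 4 & 3 & 2 & 1 \end{array} \right]. \]

The blocks of A are the matrices

\[ B_1 = [1], \quad B_2 = [\,2 \;\; 3\,], \quad B_3 = [4], \]

\[ B_4 = \begin{bmatrix} 5 \\ 8 \\ 4 \end{bmatrix}, \quad
   B_5 = \begin{bmatrix} 6 & 7 \\ 7 & 6 \\ 3 & 2 \end{bmatrix}, \quad
   B_6 = \begin{bmatrix} 8 \\ 5 \\ 1 \end{bmatrix}. \]

Using these blocks we may write A as

\[ A = \begin{bmatrix} B_1 & B_2 & B_3 \\ B_4 & B_5 & B_6 \end{bmatrix}. \]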

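Example 2.3.2 The matrix

\[ \begin{bmatrix} O_2 & I_2 \\ I_2 & O_2 \end{bmatrix} \]

is, in fact, the permutation matrix

\[ \begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix}. \]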


If a matrix is partitioned into blocks by partitioning its rows
into u nonempty sets and its columns into v nonempty sets, then
it is of the form

\[ A = \begin{bmatrix}
A_{11} & A_{12} & \cdots & A_{1v} \\
A_{21} & A_{22} & \cdots & A_{2v} \\
\vdots & \vdots & \ddots & \vdots \\
A_{u1} & A_{u2} & \cdots & A_{uv}
\end{bmatrix}, \qquad (2.4) \]

where the $A_{ij}$ are the blocks of the matrix partition. The
partitioned matrix A in (2.4) can be regarded as a matrix of type u
by v whose entries are themselves matrices (the blocks). We say that
A is a block matrix of type u by v. Using these ideas we can carry
over our basic matrix operations to partitioned matrices.

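If c is a number, then evidently

\[ cA = \begin{bmatrix}
cA_{11} & cA_{12} & \cdots & cA_{1v} \\
cA_{21} & cA_{22} & \cdots & cA_{2v} \\
\vdots & \vdots & \ddots & \vdots \\
cA_{u1} & cA_{u2} & \cdots & cA_{uv}
\end{bmatrix}. \]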


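If B is a matrix of the same type as A, and B is partitioned
into blocks in the same way that A is partitioned in (2.4), then
the definition of matrix addition implies that

\[ A + B = \begin{bmatrix}
A_{11} + B_{11} & A_{12} + B_{12} & \cdots & A_{1v} + B_{1v} \\
A_{21} + B_{21} & A_{22} + B_{22} & \cdots & A_{2v} + B_{2v} \\
\vdots & \vdots & \ddots & \vdots \\
A_{u1} + B_{u1} & A_{u2} + B_{u2} & \cdots & A_{uv} + B_{uv}
\end{bmatrix}. \]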
The relationship of matrix multiplication to partitioned matrices
is a little more subtle.

Now let $A = [a_{ij}]$ and $B = [b_{ij}]$ be matrices of types m by n
and n by p, respectively. Assume that A and B are partitioned as
block matrices of types u by v and v by w, respectively:

\[ A = \begin{bmatrix}
M_{11} & M_{12} & \cdots & M_{1v} \\
M_{21} & M_{22} & \cdots & M_{2v} \\
\vdots & \vdots & \ddots & \vdots \\
M_{u1} & M_{u2} & \cdots & M_{uv}
\end{bmatrix}, \qquad
B = \begin{bmatrix}
N_{11} & N_{12} & \cdots & N_{1w} \\
N_{21} & N_{22} & \cdots & N_{2w} \\
\vdots & \vdots & \ddots & \vdots \\
N_{v1} & N_{v2} & \cdots & N_{vw}
\end{bmatrix}. \]

Assume also that the column partition of A agrees with the row
partition of B. This means that $M_{ik}$ is an $m_i$ by $n_k$ matrix and
$N_{kj}$ is an $n_k$ by $p_j$ matrix. Here the integers m, n, and p are
partitioned as $m = m_1 + m_2 + \cdots + m_u$, $n = n_1 + n_2 + \cdots + n_v$,
and $p = p_1 + p_2 + \cdots + p_w$. Under these circumstances, we say
that A and B are conformally partitioned.

Let the set of black vertices of G(A) be partitioned in accordance
with the partition of the integer m, and let the set of white
vertices of G(A) be partitioned according to the partition of the
integer n. Similarly, let the black and white vertices of G(B) be
partitioned according to the partitions of n and p, respectively. In
forming G(A) * G(B) and G(A) · G(B), one gets a natural
partitioning for the product AB as

\[ AB = \begin{bmatrix}
P_{11} & P_{12} & \cdots & P_{1w} \\
P_{21} & P_{22} & \cdots & P_{2w} \\
\vdots & \vdots & \ddots & \vdots \\
P_{u1} & P_{u2} & \cdots & P_{uw}
\end{bmatrix}, \]

where the blocks $P_{ij}$ are of size $m_i$ by $p_j$.

Now we have to see how to calculate the block $P_{ij}$. The paths
of length 2 that in G(A) * G(B) start from the ith set of $m_i$ black
vertices and terminate in the jth set of $p_j$ white vertices
correspond to the block $P_{ij}$ of AB. These paths, of course, cross
through gray vertices. We partition the n gray vertices into v parts
according to the partition $n = n_1 + n_2 + \cdots + n_v$ of n. The paths
of length 2 that cross through the kth set of $n_k$ gray vertices
correspond to the matrix product $M_{ik}N_{kj}$. Because each path of
length 2 in G(A) * G(B) crosses through exactly one of the v sets of
gray vertices, we obtain the formula

\[ P_{ij} = M_{i1}N_{1j} + M_{i2}N_{2j} + \cdots + M_{iv}N_{vj}. \qquad (2.5) \]

In other words, matrices conformally partitioned into blocks are
multiplied formally by the same rule as for matrix multiplication.
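
Formula (2.5) can be tried out on a small conformal partition. The sketch below is an illustration under stated assumptions: the helper names `split`, `block_product`, and `assemble` are hypothetical, the matrix A and its partition are those of Example 2.3.1, and the matrix B with its row partition 1 + 2 + 1 is made up for the purpose.

```python
def matmul(A, B):
    """Matrix product computed row by column."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def matadd(A, B):
    """Entrywise sum of two matrices of the same type."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def split(M, row_sizes, col_sizes):
    """Cut M into a grid of blocks according to the given row/column sizes."""
    blocks, r = [], 0
    for rs in row_sizes:
        row_blocks, c = [], 0
        for cs in col_sizes:
            row_blocks.append([row[c:c + cs] for row in M[r:r + rs]])
            c += cs
        blocks.append(row_blocks)
        r += rs
    return blocks


def block_product(MB, NB):
    """Apply (2.5): P_ij = M_i1 N_1j + M_i2 N_2j + ... + M_iv N_vj."""
    result = []
    for i in range(len(MB)):
        row_blocks = []
        for j in range(len(NB[0])):
            P = matmul(MB[i][0], NB[0][j])
            for k in range(1, len(NB)):
                P = matadd(P, matmul(MB[i][k], NB[k][j]))
            row_blocks.append(P)
        result.append(row_blocks)
    return result


def assemble(blocks):
    """Glue a grid of blocks back into one matrix."""
    rows = []
    for row_blocks in blocks:
        for parts in zip(*row_blocks):
            rows.append([x for part in parts for x in part])
    return rows


A = [[1, 2, 3, 4], [5, 6, 7, 8], [8, 7, 6, 5], [4, 3, 2, 1]]
B = [[1, 0], [0, 1], [1, 1], [2, -1]]          # assumed sample matrix

MB = split(A, [1, 3], [1, 2, 1])   # rows 1 + 3, columns 1 + 2 + 1
NB = split(B, [1, 2, 1], [1, 1])   # rows 1 + 2 + 1 match the columns of A
print(assemble(block_product(MB, NB)) == matmul(A, B))
```

The row partition of B must match the column partition of A; otherwise the block products $M_{ik}N_{kj}$ are not defined.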


2.4 Exercises 


1. Compute the matrix product 


be 23 4 
2,58 ı511[ 
4 0 


2. Let $D = \mathrm{diag}(d_1, d_2, \ldots, d_n)$, and let A be a matrix of
order n. Show that DA is the matrix obtained from A by multiplying
each element in row i by $d_i$ for i = 1,2,...,n, and
that for AD we multiply each element in column i by $d_i$.


3. Let A be an m by n matrix and let B be an n by p matrix.
Let $\alpha_1, \alpha_2, \ldots, \alpha_m$ be the rows of A and let
$\gamma_1, \gamma_2, \ldots, \gamma_p$ be the columns of B. Show that the
rows of AB are

\[ \alpha_1B, \alpha_2B, \ldots, \alpha_mB \]

and the columns of AB are

\[ A\gamma_1, A\gamma_2, \ldots, A\gamma_p. \]

Conclude that if A has a row of all zeros, so does AB, and
that if B has a column of all zeros, so does AB.


4. Let A and B be upper (lower) triangular matrices of order
n. By using the König digraph, show that the matrix AB is
also an upper (lower) triangular matrix.
5. Construct the König digraphs of the two matrices in Exercise
1, and compute their digraph composition and product.


6. Let

\[ A = \begin{bmatrix}
a_{11} & a_{12} & a_{13} & a_{14} & a_{15} \\
a_{21} & a_{22} & a_{23} & a_{24} & a_{25} \\
a_{31} & a_{32} & a_{33} & a_{34} & a_{35} \\
a_{41} & a_{42} & a_{43} & a_{44} & a_{45}
\end{bmatrix}. \]

Let P be the permutation matrix corresponding to the
permutation 2341 of {1,2,3,4}, and let Q be the permutation
matrix corresponding to the permutation 43512 of
{1,2,3,4,5}. Using the König digraph, compute PAQ.


7. Let r be a nonnegative integer, and define $H_r = [h^{(r)}_{ij}]$ to be
the matrix of order n with $h^{(r)}_{ij} = \delta_{i,j-r+1}$ $(1 \le i, j \le n)$.
(Here the subscript j − r + 1 is whichever of 1,2,...,n it is
congruent to modulo n.) First show that $H_r$ is a permutation
matrix, and then, using the König digraph, show that
$H_pH_q = H_{p+q}$ whenever p and q are nonnegative integers.


8. Let $I_n(i,j)$ be the (permutation) matrix of order n obtained
by interchanging rows i and j of the identity matrix $I_n$.
Thus, $I_n(i,j) = I_n(j,i)$. Show that the following identities
hold:

\[ I_n(i,j)^2 = I_n \quad\text{and}\quad I_n(i,k)\,I_n(k,j)\,I_n(j,i) = I_n(k,j). \]


9. Let P be a permutation matrix of order n. Use the König
digraph to prove that

\[ PP^T = P^TP = I_n. \]
10. Using block multiplication, compute the product


Tel Os: | To hj h 
Oz Is Is 


Chapter 3 


Powers of Matrices 


In this chapter we consider powers of square matrices and describe 
them in terms of a digraph, different from the König digraph, that 
we associate with a square matrix. The basic result here is a 
theorem by which the entries of powers of a square matrix can 
be calculated by enumeration of certain walks in the associated 
digraph. As applications we consider Markov chains, finite au- 
tomata, and counting permutations with certain restrictions. We 
also show how a certain structured matrix called a circulant results 
from the powers of a matrix whose digraph is a cycle. 


3.1 Matrix Powers and Digraphs 


An associative groupoid is a pair (X, ·) consisting of a nonempty
set X and a binary operation, denoted by the usual multiplication
symbol ·, that satisfies the associative law. The associative
groupoid may have an identity element e satisfying a · e = e · a = a
for every element a in X. In an associative groupoid (X, ·), for
every element a ∈ X and every positive integer k, the kth
power of a is defined inductively as follows:

\[ a^k = \begin{cases} a, & \text{if } k = 1, \\ a \cdot a^{k-1}, & \text{if } k > 1. \end{cases} \]

If (X, ·) has an identity element e, then we also set $a^0 = e$.
The set of square matrices of a given order n over a field is an
associative groupoid under multiplication with identity element
equal to the identity matrix $I_n$. Thus nonnegative integral powers
of a matrix A of order n are defined by $A^0 = I_n$, $A^1 = A$, and
$A^k = A \cdot A^{k-1}$ for k > 1.

Now that we have defined matrix powers, we can also define
matrix polynomials in a natural way. If

\[ p(x) = a_0x^k + a_1x^{k-1} + \cdots + a_{k-1}x + a_k \]

is a polynomial of degree k, where the coefficients $a_0, a_1, \ldots, a_k$
are elements of a field F and A is a square matrix of order n over F,
then we define the matrix polynomial p(A) (really, the evaluation of
a polynomial at a square matrix) to be

\[ p(A) = a_0A^k + a_1A^{k-1} + \cdots + a_{k-1}A + a_kI_n. \]


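Example 3.1.1 Let

\[ p(x) = x^2 - x + 1 \quad\text{and}\quad A = \begin{bmatrix} 1 & 2 \\ 3 & 5 \end{bmatrix}. \]

Then

\[ p(A) = A^2 - A + I_2
= \begin{bmatrix} 7 & 12 \\ 18 & 31 \end{bmatrix}
- \begin{bmatrix} 1 & 2 \\ 3 & 5 \end{bmatrix}
+ \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
= \begin{bmatrix} 7 & 10 \\ 15 & 27 \end{bmatrix}. \]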
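
The evaluation of a matrix polynomial can be organized by Horner's rule, which avoids computing each power separately. This is a minimal sketch and not the text's own method; the helper name `poly_eval` is an assumption, and the data (the matrix with rows 1, 2 and 3, 5 and the polynomial $x^2 - x + 1$) are taken as the values consistent with the result 7, 10, 15, 27 displayed in Example 3.1.1.

```python
def matmul(A, B):
    """Matrix product computed row by column."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def poly_eval(coeffs, A):
    """Evaluate p(A) for p(x) = a_0 x^k + a_1 x^(k-1) + ... + a_k.

    coeffs lists a_0, a_1, ..., a_k; Horner's rule is applied in the form
    result <- result * A + a_i * I.
    """
    n = len(A)
    result = [[0] * n for _ in range(n)]
    for c in coeffs:
        result = matmul(result, A)
        for i in range(n):
            result[i][i] += c      # add c times the identity matrix
    return result


A = [[1, 2], [3, 5]]
print(poly_eval([1, -1, 1], A))    # p(x) = x^2 - x + 1
```

Note that the constant term contributes $a_kI_n$, not the scalar $a_k$, which is why the coefficient is added along the diagonal.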


Let $A = [a_{ij}]$ be a matrix of order n. We associate with A a
digraph D(A) with n vertices. The vertices of D(A) are denoted
by 1,2,...,n. (Unlike the König digraph, the vertices correspond
simultaneously to the n rows and the n columns of A.) There
is an edge from vertex i to vertex j of weight $a_{ij}$ for each
i, j = 1,2,...,n. Thus D(A) has a loop at each vertex i of weight
$a_{ii}$. As with the König digraph, an edge of weight zero,
corresponding to a zero entry of A, can be removed from D(A) without
any effect in our subsequent calculations; indeed, removing such
edges can have the effect of making certain calculations more
transparent as it reveals more clearly the structure of the digraph.
The weight of a walk in D(A) is defined to be the product of the
weights of all edges of the walk. Powers of a matrix can be
calculated using the digraph D(A).


Theorem 3.1.2 Let $A = [a_{ij}]$ be a matrix of order n. For each
positive integer k, the entry $a^{(k)}_{ij}$ of $A^k$ in the ith row and
jth column equals the sum of the weights of all walks in D(A) of
length k from vertex i to vertex j.


Proof. We shall give two proofs of this result: the first uses
directly the digraph D(A), and the second uses the König digraph
G(A) in an auxiliary way.


First proof. We proceed by induction on k. If k = 1, the theorem is
a direct consequence of the definition of the digraph D(A). This is
because for each i and j, there is exactly one walk of length 1 from
i to j and it has weight $a_{ij}$. Now assume the theorem holds for
the integer k. By the definition of matrix powers, $A^{k+1} = A \cdot A^k$,
and so by matrix multiplication we get

\[ a^{(k+1)}_{ij} = a_{i1}a^{(k)}_{1j} + a_{i2}a^{(k)}_{2j} + \cdots + a_{in}a^{(k)}_{nj}
   = \sum_{r=1}^{n} a_{ir}a^{(k)}_{rj}. \]

By the inductive assumption, for each r = 1,2,...,n, $a^{(k)}_{rj}$ is the
sum of the weights of all walks of length k in D(A) from vertex
r to vertex j. Consider a walk γ of length k + 1 from vertex i to
vertex j. The walk γ consists of an edge from i to r for some r
between 1 and n, followed by a walk γ′ of length k from r to j.
The weight of γ equals $a_{ir}$ times the weight of γ′.¹ Conversely, a
walk γ′ of length k from r to j preceded by the edge from i to r
gives a walk γ of length k + 1 from i to j whose weights satisfy this
same rule. It follows that $a_{ir}a^{(k)}_{rj}$ equals the sum of the weights
of all walks of length k + 1 from i to j whose first edge is the edge
from i to r. Hence $\sum_{r=1}^{n} a_{ir}a^{(k)}_{rj}$ is the sum of the weights
of all walks of length k + 1 from i to j, completing the induction
and proving the theorem.


¹Now we see the advantage of suppressing edges of weight zero. The weight
of a walk that contains an edge of weight zero is zero. If we suppress the edge
of weight zero, then the walk “disappears” and so makes no contribution to a
sum.


Second proof. Here we will be more brief. Consider the digraph
$G(A)^{(k)} = G(A) * G(A) * \cdots * G(A)$ equal to the composition of k
copies of the König digraph G(A). This digraph has k + 1 sets of
n vertices with the first set black, the last set white, and all
others gray. The sum of the weights of all walks from black vertex i
to white vertex j equals $a^{(k)}_{ij}$. There is a one-to-one
correspondence between walks of length k in $G(A)^{(k)}$ from black
vertex i to white vertex j and walks of length k in D(A) from its
vertex i to its vertex j. Moreover, corresponding walks have the same
weight. Thus $a^{(k)}_{ij}$ equals the sum of the weights of all walks of
length k in D(A) from i to j.
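
Theorem 3.1.2 can be confirmed by brute force on a small matrix. The following is a minimal sketch: the helper names `walk_weight_sum` and `matpow` and the sample matrix are assumptions, and vertices are numbered from 0 rather than 1.

```python
from itertools import product


def matmul(A, B):
    """Matrix product computed row by column."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def matpow(A, k):
    """A^k for a positive integer k, by repeated multiplication."""
    result = A
    for _ in range(k - 1):
        result = matmul(result, A)
    return result


def walk_weight_sum(A, i, j, k):
    """Sum of the weights of all walks of length k from i to j in D(A)."""
    n = len(A)
    total = 0
    for middle in product(range(n), repeat=k - 1):
        walk = (i,) + middle + (j,)
        weight = 1
        for u, v in zip(walk, walk[1:]):
            weight *= A[u][v]      # a zero edge makes the whole walk vanish
        total += weight
    return total


A = [[1, 2, 0], [0, 1, 3], [4, 0, 1]]   # assumed sample matrix
k = 4
Ak = matpow(A, k)
print(all(walk_weight_sum(A, i, j, k) == Ak[i][j]
          for i in range(3) for j in range(3)))
```

The enumeration visits all $n^{k-1}$ choices of intermediate vertices, so it is only practical for small n and k; the matrix power gives the same numbers far more cheaply.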


We give an example that demonstrates the usefulness of The- 
orem 3.1.2. Omitting the edges of weight zero (if there are, rela- 
tively speaking, many such edges) allows one to identify walks in 
a digraph more readily. 


Example 3.1.3 The digraph D(A) corresponding to the matrix

\[ A = \begin{bmatrix} a & b \\ 0 & c \end{bmatrix} \]

is drawn in Figure 3.1.

Figure 3.1 (vertices 1 and 2, with a loop of weight a at vertex 1, an
edge of weight b from vertex 1 to vertex 2, and a loop of weight c at
vertex 2)


From the vertex 1 to the vertex 1 there is only one walk of length
k, and its weight is $a^k$. Similarly, there is only one walk of length
k from vertex 2 to itself and it has weight $c^k$. From 1 to 2 there
are k walks of length k. These are the walks of weight $a^ibc^{k-1-i}$
consisting of i loops of weight a, followed by the edge of weight b,
and then k − 1 − i loops of weight c. Here i can be any integer
from 0 to k − 1. Therefore, the element in position (1,2) of
the matrix $A^k$ is equal to $b\sum_{i=0}^{k-1} a^ic^{k-1-i}$. From the
vertex 2 to the vertex 1 there are no walks. Hence we have

\[ A^k = \begin{bmatrix} a^k & b(a^{k-1} + a^{k-2}c + \cdots + c^{k-1}) \\ 0 & c^k \end{bmatrix}. \]

A square matrix A is nilpotent provided there is a positive integer
k such that $A^k = O$. Note that it follows from the inductive
definition of matrix powers that if $A^k = O$, then $A^r = O$ for all
r ≥ k. An arbitrary matrix is nonnegative if all its entries are
nonnegative numbers. Using Theorem 3.1.2, we can characterize
nonnegative matrices A that are nilpotent.


Theorem 3.1.4 Let A be a square matrix of order n. Then A
is nilpotent if the corresponding digraph D(A) does not have any
cycles; in this case, $A^n = O$. A nonnegative square matrix A is
nilpotent if and only if the corresponding digraph D(A) does not
have any cycles.


Proof. Applying Theorem 3.1.2, we see that A is a nilpotent
matrix if and only if there exists a positive integer k such that,
for all vertices i and j, the sum of the weights of the walks of
length k in D(A) from i to j is zero. In particular, A is nilpotent
if D(A) contains no walk of nonzero weight of length k. If D(A) does
not have a cycle, then there can be no such walk of length n or
greater, since such a walk would repeat a vertex and thus create a
cycle. Hence A is nilpotent and $A^n = O$ if D(A) does not have any
cycles.

Now suppose that A is a nonnegative matrix and A is nilpotent.
If D(A) contains a cycle, then D(A) has walks of arbitrarily long
length of positive weight. Because no walk has negative weight, these
weights cannot cancel in the sums of Theorem 3.1.2, contradicting the
assumption that A is nilpotent.


Note that if a matrix A of order n is nilpotent, then $A^r = O$
for all r ≥ n. This is because in the digraph D(A) with n
vertices, if there is a closed walk, then there is a cycle of length at
most n.
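
Both halves of Theorem 3.1.4 can be observed on small matrices. The sketch below is illustrative: the helper names `has_cycle` and `is_nilpotent` and the three sample matrices are assumptions. Edges of weight zero are treated as absent, and nilpotency is tested through the standard fact that a nilpotent matrix of order n satisfies $A^n = O$.

```python
def matmul(A, B):
    """Matrix product computed row by column."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def has_cycle(A):
    """Detect a cycle (loops included) among the nonzero edges of D(A)."""
    n = len(A)
    color = [0] * n   # 0 = unvisited, 1 = on the current path, 2 = finished

    def visit(u):
        color[u] = 1
        for v in range(n):
            if A[u][v] != 0:
                if color[v] == 1 or (color[v] == 0 and visit(v)):
                    return True
        color[u] = 2
        return False

    return any(color[u] == 0 and visit(u) for u in range(n))


def is_nilpotent(A):
    """Test whether A^n = O, where n is the order of A."""
    power = A
    for _ in range(len(A) - 1):
        power = matmul(power, A)
    return all(x == 0 for row in power for x in row)


U = [[0, 2, -1], [0, 0, 5], [0, 0, 0]]   # no cycles, hence nilpotent
N = [[0, 1], [1, 0]]                     # nonnegative with a cycle
C = [[1, -1], [1, -1]]                   # cycles, yet nilpotent by cancellation

for M in (U, N, C):
    print(has_cycle(M), is_nilpotent(M))
```

The matrix C shows why the second statement of the theorem needs nonnegativity: its digraph has loops, but the weights of its walks of length 2 cancel.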
Figure 3.2


Example 3.1.5 A strictly upper triangular matrix is an upper
triangular matrix that also has only zeros on its main diagonal.
Thus a matrix $A = [a_{ij}]$ of order n is a strictly upper triangular
matrix if and only if $a_{ij} = 0$ for all i and j with j ≤ i. Let A be
a strictly upper triangular matrix. Then its digraph D(A) does
not have any cycles since all edges go from a vertex i to a vertex
j with j > i. By Theorem 3.1.4 the matrix A is nilpotent.


Example 3.1.6 Theorem 3.1.4 asserts, in particular, that the
digraph of a nonnegative nilpotent matrix cannot have a cycle. The
assumption that the matrix is nonnegative cannot be omitted. The
digraph of the matrix

\[ \begin{bmatrix}
0 & 0 & 0 & 0 & 1 & -1 \\
0 & 0 & 1 & -1 & 0 & 0 \\
-1 & 0 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 1 & 0 & 0 & 0 \\
0 & 1 & 1 & 0 & 0 & 0
\end{bmatrix} \]

is displayed in Figure 3.2. The matrix is nilpotent, yet the digraph
has cycles.² Of course, if D(A) does not have any cycles, then A
is nilpotent, but, as the example shows, the converse is not true
in general.


²This example was constructed by Z. Lukić.


Example 3.1.7 Let a digraph be obtained from a tree by orienting
its edges in an arbitrary way, and assign weights to the edges
in any way whatsoever. Consider a matrix A with D(A) equal to
this digraph. Because D(A) does not contain cycles, the matrix
must be nilpotent. For example, the matrices


0 —2 
0 
0 
0 


O u ome 
ooo oe 
S'S: OS 


0 0 0 
1 0 0 
0 5 and 0 
0 0 0 


Sr 


are constructed from a path of 4 vertices in this way and are 
nilpotent. 


It follows from Theorem 3.1.4 that whether or not a nonneg- 
ative matrix is nilpotent depends only on the structure of the di- 
graph and not on the weights of its edges. There are also other 
properties of matrices that depend only on which elements are 0 
and do not depend on the values of the elements different from 0. 
Such characteristics are described in a natural way by means of 
digraphs. 

We next describe an application of Theorem 3.1.2 in probability 
theory. 


Example 3.1.8 Some random processes can be described by the 
following model: 


Let G be a digraph with n vertices, containing all possible 
edges, including a loop at each vertex. Consider the vertices to 
be states of a system, and imagine that an object, let us call it a 
particle, moves in a random way along the edges of the digraph in 
the direction of the edge. Normally, the particle is on a vertex of 
the digraph and at moments of time t = 1,2,... moves along an 
edge to another vertex or the same vertex in case of a loop. If the 
particle at some initial moment $t = t_0$ is on a vertex i (in state i),
at the next moment $t = t_0 + 1$ it has moved to vertex j (to state
j) with probability $p_{ij} \ge 0$ (i, j = 1,2,...,n). The values $p_{ij}$
are independent of the value of the discrete time variable t and, in
order that we have probability distributions for the transition at
each vertex, the $p_{ij}$ satisfy the condition

\[ p_{i1} + p_{i2} + \cdots + p_{in} = 1 \quad (i = 1,2,\ldots,n). \]

In particular, the digraph of P satisfies D(P) = G.³

The matrix $P = [p_{ij}]$ of order n is called the one-step transition
probability matrix. We call such a digraph G a Markov chain, and
indeed use this term for the matrix P itself.

It is interesting and important to investigate the behavior of 
a Markov chain over a long period of time. Of special interest 
are those cases when some of the values $p_{ij}$ are equal to 0. When
drawing the digraph G, we may omit an edge from a vertex i to 
a vertex j of weight 0, since the particle cannot move from i to 
j in this case. In this way, and as we have discussed earlier, the 
structure of the digraph, and hence the important characteristics 
of the Markov chain, become clearer in the reduced digraph. 

Because the values of the transition matrix P are the same for
all times, the probability that the particle gets from a vertex i in k
steps to a vertex j along a fixed walk (of length k) is equal to the
product of weights of the edges along that walk, i.e., to the weight
of the walk. According to Theorem 3.1.2, we conclude that the
probability of getting from a state i to a state j in k steps (along
any walk) equals the element $p^{(k)}_{ij}$ in position (i, j) of the
matrix $P^k$. Therefore, the behavior of a Markov chain is determined
by the structure of the matrices $P^k$ (k = 1,2,...). Almost all
interesting characteristics of a matrix $P^k$ can be determined by
means of the structure of the corresponding digraph, while the
weights of the edges affect only the quantitative characteristics of
the Markov chain (see Chapter 8).
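
The k-step transition probabilities can be computed exactly with rational arithmetic. This is a minimal sketch with an assumed three-state transition matrix; the helper name `matpow` is illustrative, and states are numbered from 0.

```python
from fractions import Fraction as F


def matmul(A, B):
    """Matrix product computed row by column."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def matpow(P, k):
    """P^k for a nonnegative integer k, with P^0 the identity matrix."""
    n = len(P)
    result = [[F(int(i == j)) for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = matmul(result, P)
    return result


# Assumed one-step transition probabilities; every row sums to 1.
P = [[F(1, 2), F(1, 2), F(0)],
     [F(0), F(1, 3), F(2, 3)],
     [F(1), F(0), F(0)]]

P2 = matpow(P, 2)
print(P2[0][2])                        # two-step probability, first to third state
print([sum(row) for row in matpow(P, 5)])
```

The only walk of length 2 with nonzero weight from the first state to the third passes through the second state, so the entry is the single product (1/2)(2/3). The row sums of every power stay equal to 1, as they must for a probability distribution.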


³Here is an amusing formulation of this problem. Think of the digraph as
a map of a city and the particle as a drunkard who is trying to get home (one 
of the vertices of the digraph). At each intersection, he chooses one of the 
streets to take according to the given probabilities. The question arises as to 
whether the drunkard will reach home (of course, depending on his level of 
inebriation, he may or may not recognize his home!). It turns out that under 
some mild conditions, the drunkard will reach home with high probability. 
To conclude this section, we discuss briefly an application of
Theorem 3.1.2 in finite automata theory. 


Example 3.1.9 A finite automaton is a map from finite sequences
of a certain finite set of symbols (the input symbols) into finite
sequences of another finite set of symbols (the output symbols). A
finite automaton can be represented by a digraph G. The vertices
1,2,...,n of the digraph represent the states of the automaton in
discrete moments of time t = 0,1,2,.... If in some moment of time
the automaton is in a state i, and if it is affected by a symbol $x_j$
of the input alphabet, the automaton goes to a new state determined
by i and $x_j$, while a symbol of the output alphabet, also determined
by i and $x_j$, appears at the output. Let $X = \{x_1, x_2, \ldots, x_n\}$
be the set of all input symbols. In the following we consider the
input symbols $x_1, x_2, \ldots, x_n$ as variables. We extend this set by
the empty symbol with the meaning that at a given moment there
is no symbol affecting the input of the automaton. This empty
symbol is denoted by 0, and we will sometimes, according to need,
interpret it as the number 0. If the symbols $x_{i_1}, x_{i_2}, \ldots, x_{i_s}$
are those that turn the automaton from a state i to a state j, the
edge of the digraph joining the vertices i and j gets the following
sum as its weight:

\[ a_{ij} = x_{i_1} + x_{i_2} + \cdots + x_{i_s}. \qquad (3.1) \]

If $a_{ij} = 0$, then this means that it is impossible to get from the
state i to the state j, and the automaton stays in state i. In
analogy with Markov chains, the matrix $A = [a_{ij}]$ of order n is
called the transition matrix of the automaton.

Consider a walk of length k between vertices i and j. The
weight of that walk is the product of values of the form (3.1). If
we multiply k sums of the type (3.1), we get a sum of products,
every summand being the product of k members of the set X. We
assume that the multiplication of the elements of X is
noncommutative, so that the k factors in every product maintain their
original order. If $x_{j_1}x_{j_2} \cdots x_{j_k}$ is some summand from the
weight of the walk, the sequence of input symbols
$x_{j_1}, x_{j_2}, \ldots, x_{j_k}$ sends the automaton from the state i to the
state j. Therefore, the sum of the weights of all walks of length k
between vertices i and j produces the set of all sequences of k input
symbols that turn the automaton from the vertex i to the vertex j.
According to Theorem 3.1.2, the element in position (i, j) of the
matrix $A^k$ is the sum of terms, where each term is the product of k
elements of the set X, and every such product determines a sequence
of k symbols that sends the automaton from the state i to the
state j.
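
The noncommutative products above can be modeled by letting each matrix entry be a set of words. The following sketch is an illustration under stated assumptions: a sum of the form (3.1) is stored as a Python set of strings, the empty symbol 0 as the empty set, multiplication as concatenation, and the two-state automaton is made up for the example. Representing sums as sets ignores multiplicities, which is harmless when each input word leads from a given state to only one state.

```python
def word_matmul(A, B):
    """Product of word matrices: union over k of concatenations A_ik B_kj."""
    n = len(A)
    return [[{u + v for k in range(n) for u in A[i][k] for v in B[k][j]}
             for j in range(n)]
            for i in range(n)]


def word_matpow(A, k):
    """A^k for a positive integer k."""
    result = A
    for _ in range(k - 1):
        result = word_matmul(result, A)
    return result


# Assumed automaton with states 1 and 2 (indices 0 and 1) and inputs x, y:
# from either state, x leads to state 1 and y leads to state 2.
A = [[{"x"}, {"y"}],
     [{"x"}, {"y"}]]

A2 = word_matpow(A, 2)
print(sorted(A2[0][1]))   # input words of length 2 leading from state 1 to state 2
```

Because concatenation keeps the factors in order, each word in an entry of the kth power records the input symbols in the order in which they are applied.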


3.2 Circulant Matrices 


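A circulant is a square matrix of the form

\[ A = \begin{bmatrix}
a_0 & a_1 & a_2 & \cdots & a_{n-2} & a_{n-1} \\
a_{n-1} & a_0 & a_1 & \cdots & a_{n-3} & a_{n-2} \\
a_{n-2} & a_{n-1} & a_0 & \cdots & a_{n-4} & a_{n-3} \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
a_2 & a_3 & a_4 & \cdots & a_0 & a_1 \\
a_1 & a_2 & a_3 & \cdots & a_{n-1} & a_0
\end{bmatrix}. \qquad (3.2) \]

Each row of such a matrix is a cyclic permutation of the first row;
each column is a cyclic permutation of the first column.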

There is a representation of a circulant (3.2) in terms of powers
of a certain permutation matrix. Let P be the permutation matrix
of order n defined by

\[ P = \begin{bmatrix}
0 & 1 & 0 & \cdots & 0 & 0 \\
0 & 0 & 1 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & 0 & 1 \\
1 & 0 & 0 & \cdots & 0 & 0
\end{bmatrix}. \]

The digraph D(P) corresponding to P is a cycle with n vertices
and n edges from vertex i to vertex i + 1 for i = 1,2,...,n, where
n + 1 is treated as 1 (i.e., computed modulo n, taking as residues
1,2,...,n). For each positive integer k and each vertex i there is
exactly one walk of length k beginning at i, and it terminates at
vertex i + k mod n. Thus we have that

\[ P^k = \begin{bmatrix} O & I_{n-k} \\ I_k & O \end{bmatrix} \quad (k = 1,2,\ldots,n-1,n); \]

in particular, $P^n = I_n$. Hence we obtain a representation of the
circulant given in (3.2) as

\[ A = a_0I_n + a_1P + a_2P^2 + \cdots + a_{n-1}P^{n-1}. \]

If we define the polynomial

\[ g(x) = a_0 + a_1x + a_2x^2 + \cdots + a_{n-1}x^{n-1}, \]

then

\[ A = g(P). \]

Let f(x) be any polynomial and divide f(x) by $x^n - 1$ to get

\[ f(x) = q(x)(x^n - 1) + r(x), \]

where the remainder r(x) is a polynomial of degree at most n − 1
(including possibly the zero polynomial). Since $P^n = I_n$, it follows
that

\[ f(P) = r(P), \]

implying that circulants are precisely the matrices that are
polynomials in the permutation matrix P.
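
The representation of a circulant as a polynomial in P can be checked directly. This is a minimal sketch; the helper names `shift_matrix`, `circulant_from_powers`, and `circulant_direct` are assumptions, rows and columns are indexed from 0, and the coefficient list is sample data.

```python
def matmul(A, B):
    """Matrix product computed row by column."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def shift_matrix(n):
    """The permutation matrix P whose digraph is the cycle i -> i + 1 (mod n)."""
    return [[1 if j == (i + 1) % n else 0 for j in range(n)] for i in range(n)]


def circulant_from_powers(a):
    """a_0 I + a_1 P + ... + a_{n-1} P^{n-1}, accumulated power by power."""
    n = len(a)
    P = shift_matrix(n)
    power = [[int(i == j) for j in range(n)] for i in range(n)]   # P^0
    total = [[0] * n for _ in range(n)]
    for coeff in a:
        total = [[t + coeff * p for t, p in zip(trow, prow)]
                 for trow, prow in zip(total, power)]
        power = matmul(power, P)
    return total


def circulant_direct(a):
    """The circulant (3.2) written entry by entry: entry (i, j) is a_{(j-i) mod n}."""
    n = len(a)
    return [[a[(j - i) % n] for j in range(n)] for i in range(n)]


a = [1, 2, 3, 4]
print(circulant_from_powers(a))
print(circulant_from_powers(a) == circulant_direct(a))
```

Each power of P places the coefficient $a_k$ on the kth cyclic diagonal, which is why the sum reproduces the pattern in (3.2).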


3.3 Permutations with Restrictions 


Let $X = \{x_1, x_2, \ldots, x_n\}$ be a set of n elements. As we saw in
Section 1.3, an r-permutation-with-repetition of X is an ordered
arrangement of r elements of X with repetition of elements
permitted, that is, an r-tuple $x_{i_1}x_{i_2} \cdots x_{i_r}$ where
$1 \le i_j \le n$ for j = 1,2,...,r.

When we form permutations, we may impose certain restrictions.
Here we consider restrictions of a very special type. Assume
that for each i = 1,2,...,n, the set X is partitioned into two sets,
$X_i^1$ and $X_i^2$. Thus

\[ X = X_i^1 \cup X_i^2 \quad\text{where}\quad X_i^1 \cap X_i^2 = \emptyset \quad (i = 1,2,\ldots,n). \]

We now require that the element $x_i$, wherever it occurs in the
permutation, be followed, if it is not the last element in the
permutation, by an element of $X_i^1$ only (i = 1,2,...,n). Thus a pair
$x_i, x_j$ of adjacent elements in a permutation is a permitted pair
provided $x_j \in X_i^1$. Define a matrix $A = [a_{ij}]$ of order n, where
$a_{ij} = 1$ if $x_i, x_j$ is a permitted pair, and $a_{ij} = 0$ otherwise.
The matrix A is the matrix of the permitted pairs. The matrix
$\bar{A}$ obtained from A by replacing 0's with 1's, and vice versa, is
the restriction matrix.

We now determine the number $p_k(A)$ of k-permutations-with-repetition
of X if a matrix A of permitted pairs is given. Since
A is a matrix of 0's and 1's, all edges, and hence all walks, of the
digraph D(A) have weight 1. Thus the sum of the weights of the
walks of length k from vertex i to vertex j in D(A) equals the
number of such walks. By Theorem 3.1.2, the number of walks of
length k from vertex i to vertex j equals the element $a^{(k)}_{ij}$ in the
ith row and the jth column of the matrix $A^k$. A permitted
k-permutation $x_{i_1}x_{i_2} \cdots x_{i_k}$ corresponds to the walk through
the vertices $i_1, i_2, \ldots, i_k$, which has length k − 1.

Denote the sum of all elements of a matrix Y by U(Y). We
thus have the formula

\[ p_k(A) = U(A^{k-1}) \quad (k \ge 1). \]
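
The counting argument can be compared with a direct enumeration. The sketch below assumes that a k-permutation-with-repetition with all adjacent pairs permitted corresponds to a walk of length k − 1 in D(A), so that the count is the sum of the entries of the (k − 1)st power of A; the helper names and the sample matrix of permitted pairs are illustrative.

```python
from itertools import product


def matmul(A, B):
    """Matrix product computed row by column."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def count_direct(A, k):
    """Count the k-tuples of indices in which every adjacent pair is permitted."""
    n = len(A)
    return sum(all(A[u][v] == 1 for u, v in zip(t, t[1:]))
               for t in product(range(n), repeat=k))


def count_by_powers(A, k):
    """Sum of all entries of A^(k-1), with A^0 the identity matrix."""
    n = len(A)
    power = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(k - 1):
        power = matmul(power, A)
    return sum(sum(row) for row in power)


# Assumed restriction: no element may be followed immediately by itself.
A = [[0, 1, 1],
     [1, 0, 1],
     [1, 1, 0]]

for k in range(1, 6):
    print(k, count_direct(A, k), count_by_powers(A, k))
```

For this matrix the first element can be chosen in 3 ways and each later element in 2 ways, so both counts equal $3 \cdot 2^{k-1}$.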


3.4 Exercises 


1. Let 


A= 


=No 
DAN 
wm w 
veo 


3 0 1 
Use the digraph D(A) to compute $A^2$, $A^3$, and $A^4$.


2. Let $A = [a_{ij}]$ be the matrix of order n defined by
$a_{ij} = \delta_{i,j-1}$ (i, j = 1,2,...,n), and let $B = aI_n + A$, where a
is some constant. For k a positive integer, use the digraph of
a matrix to compute $A^k$ and $B^k$.


3. Let $A = [a_{ij}]$ be a matrix of order n such that all entries of
A satisfy $|a_{ij}| \le r$. Let k be a positive integer. By bounding
the number of walks of length k in the digraph D(A), show
that the entries $a^{(k)}_{ij}$ of $A^k$ satisfy $|a^{(k)}_{ij}| \le n^{k-1}r^k$.


. Let D be the digraph obtained from the complete graph 


Ks (no loops) by replacing each edge with two oppositely 
directed edges. At each discrete time t, a particle always 
moves to a different vertex with equal probability 1/4. Com- 
pute both the one-step and two-step transition probability 
matrices P and P?. 


5. Let $D$ be a digraph obtained from $K_5$ by orienting each edge
(your choice how to orient). At each discrete time $t$, a particle
chooses one of the edges leaving its current location with
equal probability. As in the previous exercise, compute both
the one-step and two-step transition probability matrices $P$
and $P^2$.


6. For each of the three trees of order 5 (see Section 1.1), give an
orientation to each of the edges and construct (and verify) a
nilpotent matrix $A$ such that $D(A)$ is the resulting digraph.


7. Show that the product of two circulants of order $n$ is a circulant.


8. Show that the transpose of a circulant is a circulant.


Chapter 4 


Determinants 


In this chapter we first define the Coates digraph of a square ma- 
trix. The Coates digraph is a slight variation of the digraph used in 
the previous chapter. We use the Coates digraph to give a nontra- 
ditional definition of the determinant of a square matrix. Using 
this definition, we derive the basic properties of a determinant 
that are useful in its evaluation. In particular, it is shown how the 
calculation of a determinant can be reduced to the calculation of 
determinants of lower order. We also derive the formula for the 
determinant that is used in its classical definition and actually es- 
tablish the equivalence of the two definitions of the determinant. 
The determinant can be defined yet again in a third way—using 
the König digraph—a fact that will be useful later in the book. A 
special determinantal formula, derived in Section 4.3, will be used 
in Chapter 7. Section 4.5 describes the Laplace development of a 
determinant. 


4.1 Definition of the Determinant 


A digraph with $m$ vertices and $m$ edges is called a cycle digraph,
or, more simply, a cycle, provided its vertices can be numbered as
$1, 2, \ldots, m$ so that its set of $m$ edges consists of edges from vertex
$i$ to vertex $i+1$ $(i = 1, 2, \ldots, m-1)$ and an edge from vertex $m$ to
vertex 1. Let $D$ be a digraph whose set of vertices is $V$ and whose
set of edges is $E$. We recall from Chapter 1 that a subdigraph of
$D$ is a digraph whose set of vertices is a subset $U$ of $V$ and whose
set of edges is a subset $F$ of the set $E_U$ of those edges of $D$ that
join vertices in $U$. Thus, to form a subdigraph of $D$, we choose
some of the vertices (possibly all of them) and some of the edges
between these vertices (again, possibly all of them). If $U = V$,
then we have a spanning subdigraph of $D$. If $F = E_U$, then we
have an induced subdigraph (on the set $U$).

A linear subdigraph of $D$ is a spanning subdigraph of $D$ in which
each vertex has indegree 1 and outdegree 1 (i.e., exactly one edge
into each vertex and exactly one, possibly the same, out of each
vertex). Thus a linear subdigraph consists of a spanning collection
of pairwise vertex-disjoint cycles. In Figure 4.1 a digraph $D$ is
drawn along with its three linear subdigraphs $L_1$, $L_2$, $L_3$.


Figure 4.1


Let $A = [a_{ij}]$ be a square matrix of order $n$. We have already
associated two weighted digraphs with $A$, the König digraph $G(A)$
and the digraph $D(A)$. We now associate a third weighted digraph
$D^*(A)$, which is nothing more than the digraph $D(A^T)$ associated
with the transpose $A^T$ of $A$. Thus $D^*(A)$ has $n$ vertices $1, 2, \ldots, n$,
and for each $i, j$ there exists an edge from vertex $j$ to vertex $i$ of
weight $a_{ij}$. The elements of the main diagonal of $A$ correspond in
$D^*(A)$ to loops of $D^*(A)$, as they do in $D(A)$. The digraph $D^*(A)$
is called the Coates digraph of the matrix $A$.¹

Let $L$ be a linear subdigraph of the digraph $D^*(A)$. The product
of the weights of the edges of $L$ is the weight $w(L)$ of $L$. The
number of cycles contained in $L$ is denoted by $c(L)$. By $\mathcal{L}(A)$ we
mean the set of all linear subdigraphs $L$ of the Coates digraph
$D^*(A)$.


Definition 4.1.1 Let $A = [a_{ij}]$ be a square matrix of order $n$.
The determinant of $A$ is the number $\det A$ defined by the sum

$$\det A = (-1)^n \sum_{L \in \mathcal{L}(A)} (-1)^{c(L)} w(L), \tag{4.1}$$

where the summation extends over all linear subdigraphs $L$ of the
digraph $D^*(A)$. Since $(-1)^{n+c(L)} = (-1)^{n-c(L)}$, another way to
write (4.1) is

$$\det A = \sum_{L \in \mathcal{L}(A)} (-1)^{n-c(L)} w(L). \tag{4.2}$$


¹ We are using $D^*(A)$ rather than $D(A)$, that is, associating $a_{ij}$ with the
edge from $j$ to $i$ rather than the edge from $i$ to $j$, because it aids in our later
discussion.

The digraph $D^*(A)$ is named the Coates digraph and formula
(4.1) is called the Coates formula, because Coates introduced these 
in [13] in developing a procedure for treating systems of linear al- 
gebraic equations (here described in Section 6.3). It is very hard 
to establish who first came to the idea of such a graphical inter- 
pretation of a determinant (see the Coda for some bibliographical 
data). Perhaps, as presented here, the definition of the determi- 
nant could be called the Harary-Coates definition since Harary, 
referring to Coates, has proposed in [45] that this formula, in a 
somewhat changed form, could be taken as the definition of the 
determinant. The standard definition is given in Section 4.4, and 
it is equivalent to the Harary-Coates definition. 
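Formula (4.1) can be evaluated directly by enumerating linear subdigraphs. The sketch below is one possible way to do this, under the assumption that a linear subdigraph of $D^*(A)$ is encoded as a tuple `succ` with `succ[j] = i` meaning that the edge leaving vertex $j$ goes to vertex $i$ (and so has weight $a_{ij}$); the function names `cycle_count` and `coates_det` are illustrative, and vertices are numbered from 0 in the code.

```python
from itertools import permutations

def cycle_count(succ):
    """Number of cycles of the map v -> succ[v] on the vertices 0..n-1."""
    seen = [False] * len(succ)
    cycles = 0
    for start in range(len(succ)):
        if not seen[start]:
            cycles += 1
            v = start
            while not seen[v]:
                seen[v] = True
                v = succ[v]
    return cycles

def coates_det(A):
    """Determinant via formula (4.1): (-1)^n * sum of (-1)^c(L) * w(L)."""
    n = len(A)
    total = 0
    for succ in permutations(range(n)):
        # succ[j] = i encodes the edge j -> i of D*(A), which has weight a_ij.
        weight = 1
        for j, i in enumerate(succ):
            weight *= A[i][j]
        total += (-1) ** cycle_count(succ) * weight
    return (-1) ** n * total

A2 = [[1, 2], [3, 4]]
A3 = [[2, 0, 1], [1, 3, 2], [1, 1, 4]]
print(coates_det(A2), coates_det(A3))
```

Every choice of exactly one outgoing and one incoming edge per vertex is a permutation of the vertices, which is why `itertools.permutations` enumerates all candidates; those that use an entry equal to zero simply contribute weight 0. The enumeration has $n!$ terms, so this is a check of the definition rather than a practical algorithm.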


Example 4.1.2 Let

$$A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}.$$

Then $D^*(A)$ has two linear subdigraphs. One consists of the loops
at the two vertices (so two cycles of length 1) and has weight $a_{11}a_{22}$;
the other is a cycle with two vertices and has weight $a_{12}a_{21}$. Hence,
using (4.1),

$$\det A = (-1)^2\left((-1)^2 a_{11}a_{22} + (-1)^1 a_{12}a_{21}\right) = a_{11}a_{22} - a_{12}a_{21}.$$

Now let

$$A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}.$$

The digraph $D^*(A)$ is depicted in Figure 4.2 along with its six
linear subdigraphs $L_1, L_2, \ldots, L_6$.
Applying formula (4.2) for the determinant, we get

$$\begin{aligned}
\det A &= (-1)^{3-3} a_{11}a_{22}a_{33} + (-1)^{3-1} a_{12}a_{31}a_{23} + (-1)^{3-1} a_{21}a_{32}a_{13} \\
&\quad + (-1)^{3-2} a_{11}a_{23}a_{32} + (-1)^{3-2} a_{22}a_{13}a_{31} + (-1)^{3-2} a_{33}a_{12}a_{21} \\
&= a_{11}a_{22}a_{33} + a_{12}a_{31}a_{23} + a_{21}a_{32}a_{13} - a_{11}a_{23}a_{32} - a_{22}a_{13}a_{31} - a_{33}a_{12}a_{21}.
\end{aligned}$$

From this last formula we see that the determinant of a matrix of
order 3 is an algebraic sum of six products with three factors, each
taken according to the scheme in Figure 4.3. The + sign is ascribed
to the product of elements lying on the diagonal $a_{11}, a_{22}, a_{33}$ and
to the products of elements from vertices of two triangles having
one side parallel to this diagonal. The − sign is ascribed to the
product of elements lying on the diagonal $a_{31}, a_{22}, a_{13}$ and to the
products of elements from vertices of two triangles having one side
parallel to this diagonal.


Figure 4.2


Figure 4.3  (the scheme of + terms and − terms)


As with the digraphs $G(A)$ and $D(A)$, we adopt the convention
that edges of $D^*(A)$ of weight zero are not, in general, drawn.
The advantage is that, with zeros present in a matrix $A$, certain
linear subdigraphs are removed from $\mathcal{L}(A)$, namely, those that
have weight zero and thus do not affect the value of
the determinant of $A$. Thus, in calculating the determinant of a
matrix $A = [a_{ij}]$ of order 3, if $a_{12} = 0$, then the two linear
subdigraphs that contain the edge of weight $a_{12}$ have weight zero and
do not appear in the calculation. When a matrix $A$ has many zeros
occurring in a structured way, it may be possible to calculate the
determinant easily.


Example 4.1.3 In calculating the determinant of the matrix

$$A_1 = \begin{bmatrix} a_{11} & 0 & \cdots & 0 \\ 0 & a_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & a_{nn} \end{bmatrix},$$

whose only nonzero elements occur on the diagonal, we see that
$D^*(A_1)$ has only one linear subdigraph, namely, itself, and it is
drawn in Figure 4.4.


Figure 4.4


Hence we get

$$\det A_1 = (-1)^n (-1)^n a_{11}a_{22} \cdots a_{nn} = a_{11}a_{22} \cdots a_{nn}.$$

In calculating the determinant of the matrix

$$A_2 = \begin{bmatrix} 0 & 0 & \cdots & 0 & a_{1n} \\ 0 & 0 & \cdots & a_{2,n-1} & 0 \\ \vdots & \vdots & & \vdots & \vdots \\ a_{n1} & 0 & \cdots & 0 & 0 \end{bmatrix},$$

whose only nonzero elements are $a_{1n}, a_{2,n-1}, \ldots, a_{n1}$, we see again
that $D^*(A_2)$ has only itself as a linear subdigraph and it is drawn
in Figure 4.5.


Figure 4.5


This linear subdigraph has $\lfloor (n+1)/2 \rfloor$ cycles (when $n$ is odd, one
is a loop; the others are cycles of length 2). Thus we get

$$\det A_2 = (-1)^{n + \lfloor (n+1)/2 \rfloor} a_{1n}a_{2,n-1} \cdots a_{n1}.$$


Example 4.1.4 We calculate the determinant of the matrix

$$A = \begin{bmatrix} b & a & 0 & 0 \\ c & b & a & 0 \\ 0 & c & b & a \\ 0 & 0 & c & b \end{bmatrix}.$$

The corresponding digraph $D^*(A)$ is represented in Figure 4.6.


Figure 4.6


Figure 4.7  (the linear subdigraphs of $D^*(A)$: loops at all four vertices, of weight $b^4$; a cycle of length 2 together with two loops, in three ways, each of weight $ab^2c$; two cycles of length 2, of weight $a^2c^2$)


In Figure 4.7, all linear subdigraphs are given together with
the corresponding weights. Therefore we have

$$\begin{aligned}
\det A &= (-1)^4\left((-1)^4 b^4 + 3(-1)^3 ab^2c + (-1)^2 a^2c^2\right) \\
&= b^4 - 3ab^2c + a^2c^2.
\end{aligned}$$


4.2 Properties of Determinants


There are many elementary properties of determinants that are 
useful in evaluating determinants. This section is devoted to 
their derivations from our Definition 4.1.1. It is mainly based 
on the paper [22] from 1975 where one of the authors of this 
book outlined the elementary theory of determinants using graph- 
theoretical means. 

We begin with a theorem that is basically obvious as there is 
no distinction made between rows and columns in the definition 
of the determinant. 


Theorem 4.2.1 $\det A^T = \det A$.


Proof. The digraph $D^*(A^T)$ is obtained from the digraph $D^*(A)$
by changing the orientation of all edges but not changing their
weights. Therefore, there is a one-to-one correspondence between
the linear subdigraphs in $\mathcal{L}(A)$ and those in $\mathcal{L}(A^T)$. Under this
correspondence both the weight and the number of cycles are
preserved. Hence it follows from Definition 4.1.1 that $\det A = \det A^T$.


Theorem 4.2.1 implies that every statement that holds for the 
rows of a matrix also holds for the columns. In this way every 
theorem becomes two theorems and, in general, we only present 
one and leave it to the reader to formulate the other. 


Theorem 4.2.2 If each element of some row of a matrix is mul- 
tiplied by c, then the determinant is also multiplied by c. 


Proof. Let each element of row $i$ of $A$ be multiplied by $c$, resulting
in a matrix $B$. Then $D^*(A)$ and $D^*(B)$ differ only in that the
weight of each edge going into vertex $i$ is multiplied by $c$ in $D^*(B)$.
Each linear subdigraph contains exactly one edge going into vertex
$i$. Hence the weight of each linear subdigraph in $\mathcal{L}(B)$ is $c$ times the
weight of the corresponding linear subdigraph in $\mathcal{L}(A)$. Using the
definition of the determinant, we see that $\det B = c \det A$.


Figure 4.8


Theorem 4.2.3 If two rows in a matrix A are interchanged, the 
determinant is multiplied by —1. 


Proof. Let rows $i$ and $j$ of the matrix $A$, where $i \ne j$, be
interchanged, resulting in a matrix $B$. Then $D^*(B)$ is obtained from
$D^*(A)$ by changing each edge going into vertex $i$ into an edge going
into vertex $j$, keeping the same weight, and vice versa. This
establishes a one-to-one correspondence between the linear subdigraphs
$L$ in $\mathcal{L}(A)$ and the linear subdigraphs $L'$ in $\mathcal{L}(B)$ that preserves
the weight. However, as illustrated in Figure 4.8, the number of
cycles is either increased or decreased by 1. More precisely, the
number of cycles is increased by 1 if vertices $i$ and $j$ belong to the
same cycle in $L$, and is decreased by 1 if they belong to different
cycles. Thus $(-1)^{c(L')} = -1 \cdot (-1)^{c(L)}$, and the theorem now
follows.
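The cycle-count claim in this proof can be checked exhaustively for small orders. The following is a sketch under the assumption that a linear subdigraph is encoded as a tuple `succ`, where `succ[v]` is the vertex reached by the edge leaving `v`; interchanging rows $i$ and $j$ then redirects every edge going into $i$ so that it goes into $j$, and vice versa. The names `cycles`, `redirect`, and `check` are illustrative.

```python
from itertools import combinations, permutations

def cycles(succ):
    """Cycles (as vertex lists) of the linear subdigraph v -> succ[v]."""
    seen, result = set(), []
    for start in range(len(succ)):
        if start not in seen:
            cyc = []
            v = start
            while v not in seen:
                seen.add(v)
                cyc.append(v)
                v = succ[v]
            result.append(cyc)
    return result

def redirect(succ, i, j):
    """Edges going into i now go into j, and vice versa."""
    swap = {i: j, j: i}
    return tuple(swap.get(target, target) for target in succ)

def check(n):
    """True if every interchange changes the cycle count as in the proof."""
    for succ in permutations(range(n)):
        before = cycles(succ)
        for i, j in combinations(range(n), 2):
            after = cycles(redirect(succ, i, j))
            same_cycle = any(i in c and j in c for c in before)
            expected = len(before) + 1 if same_cycle else len(before) - 1
            if len(after) != expected:
                return False
    return True

print(check(4), check(5))
```

A cycle containing both $i$ and $j$ is split into two, and two cycles containing $i$ and $j$ separately are merged into one, which is the picture the proof attributes to Figure 4.8.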


Theorem 4.2.4 If the elements of a row of a matrix A are equal 
to the corresponding elements of a different row, then det A = 0. 


Proof. If we interchange the two identical rows of $A$, then,
by Theorem 4.2.3, the determinant gets multiplied by $-1$. On
the other hand, interchanging these two rows does not change
the matrix, and so the determinant stays the same. Therefore,
$\det A = -\det A$, implying that $\det A = 0$.


Corollary 4.2.5 If the elements of a row of matrix A are propor- 
tional to the elements of a different row, then det A = 0. 


Proof. This corollary is an immediate consequence of Theorems
4.2.2 and 4.2.4.


Theorem 4.2.6 Let $i$ be a fixed integer with $1 \le i \le n$. Suppose
that row $i$ of $A$ is the sum of two other rows in the sense that

$$a_{ij} = a_{ij}^{(1)} + a_{ij}^{(2)} \quad (1 \le j \le n).$$

Let $A^{(1)}$ and $A^{(2)}$ be the matrices obtained from $A$ by replacing the
element $a_{ij}$ of row $i$ of $A$ with $a_{ij}^{(1)}$ and $a_{ij}^{(2)}$, respectively. Then
$\det A$ is the sum of the determinants of $A^{(1)}$ and $A^{(2)}$:

$$\det A = \det A^{(1)} + \det A^{(2)}.$$

More generally, if row $i$ of $A$ is the sum of $p$ other rows in the
sense that

$$a_{ij} = a_{ij}^{(1)} + a_{ij}^{(2)} + \cdots + a_{ij}^{(p)} \quad (1 \le j \le n),$$

and $A^{(k)}$ is the matrix obtained from $A$ by replacing the elements
$a_{ij}$ in row $i$ with $a_{ij}^{(k)}$ $(1 \le k \le p)$, then

$$\det A = \det A^{(1)} + \det A^{(2)} + \cdots + \det A^{(p)}.$$

Proof. As in the proof of Theorem 4.2.2, each linear subdigraph
of $D^*(A)$ contains exactly one edge going into vertex $i$. Thus
the weight of each linear subdigraph $L$ in $\mathcal{L}(A)$ contains a factor
$a_{ij}^{(1)} + a_{ij}^{(2)}$ for exactly one $j$. This implies that there is a
one-to-one correspondence between the linear subdigraphs $L$ in $\mathcal{L}(A)$
and all pairs $L^{(1)}, L^{(2)}$ consisting of a linear subdigraph $L^{(1)}$ in
$\mathcal{L}(A^{(1)})$ and a linear subdigraph $L^{(2)}$ in $\mathcal{L}(A^{(2)})$. Moreover, in this
correspondence,

$$w(L) = w(L^{(1)}) + w(L^{(2)}).$$

Using Definition 4.1.1 we now compute that

$$\det A = \det A^{(1)} + \det A^{(2)}.$$

The theorem in its full generality now follows easily by induction. □


Theorem 4.2.7 The determinant of a matrix is unchanged if the 
elements of some row are multiplied by a number and added to a 
different row. 


Proof. This theorem follows immediately upon first applying
Theorem 4.2.6 and then applying Corollary 4.2.5 (that is, Theorems 4.2.2 and 4.2.4).


Let $u^{(1)}, u^{(2)}, \ldots, u^{(p)}$ be 1 by $n$ row vectors (or $n$ by 1 column
vectors), and let $c_1, c_2, \ldots, c_p$ be numbers. Recall that

$$c_1 u^{(1)} + c_2 u^{(2)} + \cdots + c_p u^{(p)}$$

is a linear combination of $u^{(1)}, u^{(2)}, \ldots, u^{(p)}$.


Theorem 4.2.8 Let $A = [a_{ij}]$ be a matrix of order $n$. Assume
that some row of $A$ is a linear combination of its other rows. Then
$\det A = 0$.


Proof. Assume, for instance, that row 1 of $A$ is a linear
combination of rows $2, 3, \ldots, n$. It follows from Theorem 4.2.6 that $\det A$
can be written as a sum of determinants of matrices whose first
row is proportional to some other row of $A$. By Corollary 4.2.5,
each of these determinants equals zero and thus $\det A = 0$.


Example 4.2.9 Let

$$A = \begin{bmatrix} 7 & 2 & 19 \\ 1 & -2 & 3 \\ 2 & 4 & 5 \end{bmatrix}.$$

Because row 1 is three times the second row plus two times the
third row, $\det A = 0$ by Theorem 4.2.8.
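The row properties derived so far are what make elimination a practical way to evaluate a determinant. The sketch below is one way to combine them, assuming exact rational arithmetic via `fractions.Fraction`: an interchange of rows flips the sign (Theorem 4.2.3), adding a multiple of one row to another changes nothing (Theorem 4.2.7), and the resulting upper triangular matrix has only its loops as a linear subdigraph of nonzero weight, so its determinant is the product of the diagonal entries. The function name `det_by_row_ops` is illustrative.

```python
from fractions import Fraction

def det_by_row_ops(A):
    """Determinant by reduction to upper triangular form with exact arithmetic."""
    M = [[Fraction(x) for x in row] for row in A]
    n = len(M)
    sign = 1
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            # The triangular form would have a zero on its diagonal.
            return Fraction(0)
        if pivot != col:
            M[col], M[pivot] = M[pivot], M[col]
            sign = -sign                      # Theorem 4.2.3
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            # Subtract a multiple of the pivot row (Theorem 4.2.7).
            M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    result = Fraction(sign)
    for d in range(n):
        result *= M[d][d]
    return result

example = [[7, 2, 19], [1, -2, 3], [2, 4, 5]]
print(det_by_row_ops(example))
```

For the matrix of Example 4.2.9 the elimination produces a zero row, in agreement with Theorem 4.2.8. Unlike the enumeration of linear subdigraphs, this takes on the order of $n^3$ arithmetic operations.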


Definition 4.2.10 Let $A = [a_{ij}]$ be an $m$ by $n$ matrix. Let
$K = \{i_1, i_2, \ldots, i_k\}$ be a set of $k$ elements with $K \subseteq \{1, 2, \ldots, m\}$, and
let $L = \{j_1, j_2, \ldots, j_l\}$ be a set of $l$ elements with $L \subseteq \{1, 2, \ldots, n\}$.
The sets $K$ and $L$ designate a collection of row indices and column
indices, respectively, of the matrix $A$, and the $k$ by $l$ submatrix
determined by them is

$$A[K, L] = \begin{bmatrix} a_{i_1 j_1} & a_{i_1 j_2} & \cdots & a_{i_1 j_l} \\ a_{i_2 j_1} & a_{i_2 j_2} & \cdots & a_{i_2 j_l} \\ \vdots & \vdots & & \vdots \\ a_{i_k j_1} & a_{i_k j_2} & \cdots & a_{i_k j_l} \end{bmatrix}.$$

If $L = K$, then $A[K, K]$ is a principal submatrix of $A$, sometimes
denoted more simply as $A[K]$.

The determinant of a square submatrix of $A$ is called a minor
of $A$. Thus a minor of $A$ equals $\det A[K, L]$, where $|K| = |L|$. If,
in addition, $K = L$, then $\det A[K]$ is a principal minor of $A$.

Now assume that $A$ is a square matrix of order $n$. Let $i$ and
$j$ be integers with $1 \le i, j \le n$. Let $A_{ij}$ be the submatrix of
$A$ of order $n - 1$ obtained by striking out row $i$ and column $j$
of $A$ (thus, in the above notation, $A_{ij} = A[K, L]$, where
$K = \{1, 2, \ldots, i-1, i+1, \ldots, n\}$ and $L = \{1, \ldots, j-1, j+1, \ldots, n\}$).
The cofactor (or algebraic complement) $\alpha_{ij}$ of the element $a_{ij}$ of
the matrix $A$ is given by

$$\alpha_{ij} = (-1)^{i+j} \det A_{ij}.$$

Note that the matrix $A_{ij}$, and hence the cofactor $\alpha_{ij}$, do not
depend on any of the elements in row $i$ and column $j$ of $A$.

We now obtain a recursive formula for the determinant by
showing how the determinant of a matrix of order $n$ can be
evaluated in terms of the determinants of $n$ matrices of order $n - 1$.


Theorem 4.2.11 The determinant of a matrix $A = [a_{ij}]$ of order
$n$ can be evaluated by developing along row $i$ as follows:

$$\det A = \sum_{j=1}^{n} a_{ij}\alpha_{ij} = \sum_{j=1}^{n} (-1)^{i+j} a_{ij} \det A_{ij} \quad (i = 1, 2, \ldots, n).$$

It can also be evaluated by developing along column $j$:

$$\det A = \sum_{i=1}^{n} a_{ij}\alpha_{ij} = \sum_{i=1}^{n} (-1)^{i+j} a_{ij} \det A_{ij} \quad (j = 1, 2, \ldots, n).$$


Proof. Because of Theorem 4.2.1, it suffices to prove the formula
for development along row $i$. In addition, by Theorem 4.2.3,
it suffices to prove the theorem for $i = n$. This is because by
successively interchanging row $i$ with rows $i+1, i+2, \ldots, n$, we
obtain a matrix $B = [b_{ij}]$, where in $B$ the rows of $A$ are in the
order $1, \ldots, i-1, i+1, \ldots, n, i$. With these interchanges we have
$\det B = (-1)^{n-i} \det A$. Moreover, using the notation in Definition
4.2.10, we have $b_{nj} = a_{ij}$ and $B_{nj} = A_{ij}$. Hence, developing the
determinant of $B$ along its row $n$, we get

$$\begin{aligned}
\det A &= (-1)^{n-i} \det B \\
&= (-1)^{n-i} \sum_{j=1}^{n} (-1)^{n+j} b_{nj} \det B_{nj} \\
&= (-1)^{n-i} \sum_{j=1}^{n} (-1)^{n+j} a_{ij} \det A_{ij} \\
&= \sum_{j=1}^{n} (-1)^{2n-i+j} a_{ij} \det A_{ij} \\
&= \sum_{j=1}^{n} (-1)^{i+j} a_{ij} \det A_{ij}.
\end{aligned}$$


So we need only establish the case $i = n$, and we proceed to do so.

Each term in the sum in Definition 4.1.1 for the determinant
contains one element from each row of $A$, since in a linear
subdigraph there is exactly one edge coming into each vertex. Similarly,
each term contains one element from each column. Hence we may
write

$$\det A = \sum_{j=1}^{n} a_{nj}\beta_j, \tag{4.3}$$

where $\beta_j$ does not depend on any of the elements in row $n$ and
column $j$ of $A$ $(j = 1, 2, \ldots, n)$. The terms in this summation
correspond to a partition of the linear subdigraphs in $\mathcal{L}(A)$
into $\mathcal{L}_1(A), \mathcal{L}_2(A), \ldots, \mathcal{L}_n(A)$, where $\mathcal{L}_j(A)$ consists of those
linear subdigraphs where the edge from vertex $j$ goes to vertex $n$
$(j = 1, 2, \ldots, n)$. The linear subdigraphs $L$ in $\mathcal{L}_n(A)$ contain a
loop at vertex $n$. Deleting that loop (so a cycle) from $L$, we get
a linear subdigraph $L'$ in $\mathcal{L}(A_{nn})$, where $w(L) = a_{nn}w(L')$ and
$c(L) = c(L') + 1$, and so $(-1)^{c(L)} = -(-1)^{c(L')}$. Therefore

$$\begin{aligned}
\beta_n &= (-1)^n \sum_{L' \in \mathcal{L}(A_{nn})} (-1)^{c(L')+1} w(L') \\
&= (-1)^{n-1} \sum_{L' \in \mathcal{L}(A_{nn})} (-1)^{c(L')} w(L') \\
&= \det A_{nn} \\
&= (-1)^{n+n} \det A_{nn} = \alpha_{nn}.
\end{aligned}$$

We now consider the coefficient $\beta_j$ of $a_{nj}$ in (4.3) where $1 \le j < n$.
By successively interchanging column $j$ with columns $j+1, j+2, \ldots, n$
we obtain a matrix $C = [c_{ij}]$, where in $C$ the columns of
$A$ are in the order $1, \ldots, j-1, j+1, \ldots, n, j$. We have
$\det C = (-1)^{n-j} \det A$. Using the notation in Definition 4.2.10, we have
$c_{nn} = a_{nj}$ and $C_{nn} = A_{nj}$. It follows from what we have proved in
the preceding paragraph that

$$\beta_j = (-1)^{n-j} \det C_{nn} = (-1)^{n+j} \det C_{nn} = (-1)^{n+j} \det A_{nj} = \alpha_{nj}.$$

Therefore we have

$$\det A = \sum_{j=1}^{n} a_{nj}\alpha_{nj},$$

as desired.


Example 4.2.12 Let


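$$A = \begin{bmatrix} 1 & 0 & 2 & 1 \\ 0 & 3 & 2 & 0 \\ 2 & 1 & 3 & 4 \\ 1 & 3 & 2 & 0 \end{bmatrix}.$$


Developing the determinant along row 2 and taking into account
the two 0's in row 2, we get that

$$\begin{aligned}
\det A &= (-1)^4 \cdot 3 \cdot \det \begin{bmatrix} 1 & 2 & 1 \\ 2 & 3 & 4 \\ 1 & 2 & 0 \end{bmatrix} + (-1)^5 \cdot 2 \cdot \det \begin{bmatrix} 1 & 0 & 1 \\ 2 & 1 & 4 \\ 1 & 3 & 0 \end{bmatrix} \\
&= (-1)^4 \cdot 3 \cdot (1) + (-1)^5 \cdot 2 \cdot (-7) \\
&= 3 + 14 = 17.
\end{aligned}$$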


The two determinants of order 3 can be computed either by the 
formula given in Example 4.1.2 or by further determinant devel- 
opment. 
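The development of Theorem 4.2.11 translates directly into a recursive procedure. The following is a minimal sketch, with vertices and indices numbered from 0 (which does not change the parity of $i + j$); the names `minor_matrix` and `det_cofactor` are illustrative, and the matrix `M` is an illustrative one whose second row has two zeros.

```python
def minor_matrix(A, i, j):
    """The submatrix A_ij: strike out row i and column j."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(A) if r != i]

def det_cofactor(A, i=0):
    """Develop det A along row i; the smaller determinants use their first row."""
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for j in range(n):
        if A[i][j] != 0:          # zero entries contribute nothing
            sign = (-1) ** (i + j)
            total += sign * A[i][j] * det_cofactor(minor_matrix(A, i, j))
    return total

M = [[1, 0, 2, 1],
     [0, 3, 2, 0],
     [2, 1, 3, 4],
     [1, 3, 2, 0]]
print([det_cofactor(M, i) for i in range(4)])
```

Developing along any row gives the same value, but a row with many zeros, such as the second row here, requires fewer determinants of lower order.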


We conclude this section by deriving two more important prop- 
erties of the determinant. 


Theorem 4.2.13 Let

$$A = \begin{bmatrix} A_1 & O \\ B & A_2 \end{bmatrix},$$

where $A_1$ and $A_2$ are square submatrices of $A$. Then

$$\det A = \det A_1 \det A_2.$$

In particular, the determinant of $A$ does not depend on $B$.


Proof. Let $A$, $A_1$, $A_2$ be matrices of orders $n$, $n_1$, $n_2$, respectively,
where $n = n_1 + n_2$. The digraph $D = D^*(A)$ is formed from
the digraphs $D_1 = D^*(A_1)$ and $D_2 = D^*(A_2)$ by including some
edges that go from the vertices of $D_1$ to the vertices of $D_2$. These
edges correspond to the nonzero entries of $B$. Because no edges go
from $D_2$ to $D_1$, no cycle of $D$ contains edges corresponding to the
nonzero entries of $B$. Therefore, each linear subdigraph $L$ of $D$
consists of the union of a linear subdigraph $L_1$ of $D_1$ and a linear
subdigraph $L_2$ of $D_2$. Moreover, every such union gives a linear
subdigraph of $D$. We thus have

$$c(L) = c(L_1) + c(L_2) \quad \text{and} \quad w(L) = w(L_1)w(L_2),$$

and

$$\begin{aligned}
\det A &= (-1)^n \sum_{L \in \mathcal{L}(A)} (-1)^{c(L)} w(L) \\
&= (-1)^{n_1+n_2} \sum_{L_1 \in \mathcal{L}(A_1)} \sum_{L_2 \in \mathcal{L}(A_2)} (-1)^{c(L_1)+c(L_2)} w(L_1)w(L_2) \\
&= \Bigl( (-1)^{n_1} \sum_{L_1 \in \mathcal{L}(A_1)} (-1)^{c(L_1)} w(L_1) \Bigr) \Bigl( (-1)^{n_2} \sum_{L_2 \in \mathcal{L}(A_2)} (-1)^{c(L_2)} w(L_2) \Bigr) \\
&= \det A_1 \det A_2.
\end{aligned}$$


Theorem 4.2.13 can be used to show that the determinant is a 
multiplicative function. 


Theorem 4.2.14 If A and B are square matrices of the same 


order n, then 
det AB = det A det B. (4.4) 


Proof. From Theorem 4.2.13 we get

$$\det \begin{bmatrix} A & O \\ -I_n & B \end{bmatrix} = \det A \det B. \tag{4.5}$$

We multiply column 1 of the matrix of order $2n$ in (4.5) by $b_{11}$,
column 2 by $b_{21}$, ..., and column $n$ by $b_{n1}$, and add each of them
to the column $n + 1$. Furthermore, we multiply column 1 by $b_{12}$,
column 2 by $b_{22}$, ..., and column $n$ by $b_{n2}$, and add each of them
to the column $n + 2$. Continuing like this and using the fact that by
Theorem 4.2.7 the determinant is unchanged, we obtain

$$\det \begin{bmatrix} A & O \\ -I_n & B \end{bmatrix} = \det \begin{bmatrix} A & AB \\ -I_n & O \end{bmatrix}. \tag{4.6}$$

Now by $n$ interchanges of pairs of columns (1 and $n+1$, 2 and
$n+2$, ..., $n$ and $2n$) we get by Theorems 4.2.3 and 4.2.13 that

$$\begin{aligned}
\det \begin{bmatrix} A & AB \\ -I_n & O \end{bmatrix} &= (-1)^n \det \begin{bmatrix} AB & A \\ O & -I_n \end{bmatrix} = (-1)^n \det(AB) \det(-I_n) \\
&= (-1)^n \det AB \, (-1)^n = \det AB.
\end{aligned} \tag{4.7}$$

Combining (4.5), (4.6), and (4.7), we obtain (4.4).
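The chain of equalities in this proof can be traced on a concrete case. The sketch below assumes two hypothetical matrices of order 3 and assembles the block matrices of order 6 that occur in (4.5) and (4.6); the helpers `det`, `mat_mul`, and `block` are illustrative, and `det` develops along the first row as in Theorem 4.2.11.

```python
def det(M):
    """Determinant by development along the first row (Theorem 4.2.11)."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def mat_mul(X, Y):
    """Matrix product of lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def block(TL, TR, BL, BR):
    """Assemble [[TL, TR], [BL, BR]] from four n-by-n blocks."""
    return [a + b for a, b in zip(TL, TR)] + [a + b for a, b in zip(BL, BR)]

n = 3
A = [[1, 2, 0], [3, 1, 1], [0, 2, 1]]   # hypothetical data
B = [[2, 0, 1], [1, 1, 0], [1, 3, 2]]   # hypothetical data
O = [[0] * n for _ in range(n)]
neg_I = [[-int(i == j) for j in range(n)] for i in range(n)]
AB = mat_mul(A, B)

lhs_45 = det(block(A, O, neg_I, B))      # left side of (4.5)
rhs_46 = det(block(A, AB, neg_I, O))     # right side of (4.6)
print(lhs_45, rhs_46, det(A) * det(B), det(AB))
```

All four printed values agree, as (4.5), (4.6), and (4.7) predict.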


Theorem 4.2.14 asserts that the determinant of a product of 
square matrices of the same order equals the product of the de- 
terminants of each of the two matrices. The product AB of two 
nonsquare matrices A and B may be a square matrix and then 
will have a determinant, even though neither factor does. In fact, 
this happens exactly when A is an m by n matrix and B is an n 
by m matrix for some integers m and n. In this situation, AB is 
a square matrix of order m, and it is natural to ask whether or 
not there is a formula for det(AB) that generalizes the product 
rule of Theorem 4.2.14. Such a formula exists, and it is called the 
Binet-Cauchy formula. 


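Example 4.2.15 Let

$$A = \begin{bmatrix} a \\ b \\ c \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} x & y & z \end{bmatrix}.$$

Then

$$AB = \begin{bmatrix} ax & ay & az \\ bx & by & bz \\ cx & cy & cz \end{bmatrix}.$$

Applying Corollary 4.2.5, we see that since rows 1 and 2 are
proportional (as are rows 1 and 3), $\det(AB) = 0$.

This example is a special case of a more general situation. Let

$$A = \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_m \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} b_1 & b_2 & \cdots & b_m \end{bmatrix}$$

be $m$ by 1 and 1 by $m$ matrices, respectively. Then

$$AB = \begin{bmatrix} a_1b_1 & a_1b_2 & \cdots & a_1b_m \\ a_2b_1 & a_2b_2 & \cdots & a_2b_m \\ \vdots & \vdots & & \vdots \\ a_mb_1 & a_mb_2 & \cdots & a_mb_m \end{bmatrix}$$

is a square matrix of order $m$. Each pair of rows and each pair of
columns is proportional. Thus, if $m > 1$, $\det(AB) = 0$.


Let $A = [a_{ij}]$ and $B = [b_{ij}]$. Let the columns of $A$ be the $m$
by 1 matrices $C_1, C_2, \ldots, C_n$, and let the rows of $B$ be the 1 by
$m$ matrices $R_1, R_2, \ldots, R_n$. Then it follows from the definition of
matrix multiplication that

$$AB = C_1R_1 + C_2R_2 + \cdots + C_nR_n. \tag{4.8}$$

This is because the element in position $(i, j)$ of $AB$ is
$a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{in}b_{nj}$, and this is the sum of the elements in position
$(i, j)$ of the matrices $C_1R_1, C_2R_2, \ldots, C_nR_n$.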

The next theorem contains the Binet-Cauchy formula. 


Theorem 4.2.16 Let $A$ and $B$ be $m$ by $n$ and $n$ by $m$ matrices,
respectively. If $m > n$, then $\det(AB) = 0$. If $m \le n$, then

$$\det(AB) = \sum_{K} \det A[\{1, 2, \ldots, m\}, K] \, \det B[K, \{1, 2, \ldots, m\}], \tag{4.9}$$

where the summation extends over all subsets $K$ of $\{1, 2, \ldots, n\}$
of cardinality $m$.


Proof. First assume that $m > n$. Then, from (4.8), we conclude
that the rows of $AB$ are linear combinations of the $n$ rows
of $B$. Applying Theorems 4.2.6 and 4.2.2, we see that $\det(AB)$ is a
sum of multiples of determinants of matrices, each of whose $m$
rows is one of the $n$ rows of $B$. Because $m > n$, each of these
matrices has two equal rows and so by Theorem 4.2.4, $\det(AB) = 0$.

Now let $m \le n$. Let $C$ be the matrix of order $m + n$ defined
by

$$C = \begin{bmatrix} A & O \\ -I_n & B \end{bmatrix}.$$

Similar to the proof of Theorem 4.2.14, we obtain by interchanging
each of the last $m$ columns with each of the first $n$ columns in turn,²
that


$$\det C = \det \begin{bmatrix} A & O \\ -I_n & B \end{bmatrix}. \tag{4.10}$$

Similar to the proof of Theorem 4.2.14, we also obtain by
interchanging each of the last $m$ columns with each of the first $n$
columns in turn³ that

$$\begin{aligned}
\det C &= \det \begin{bmatrix} A & AB \\ -I_n & O \end{bmatrix} = (-1)^{mn} \det \begin{bmatrix} AB & A \\ O & -I_n \end{bmatrix} \\
&= (-1)^{mn+n} \det(AB).
\end{aligned} \tag{4.11}$$

Since $A$ and $B$ are, in general, nonsquare matrices, we cannot
invoke Theorem 4.2.13 to conclude that the first determinant in
(4.11) equals $\det(A) \det(B)$.

Consider the Coates digraph $D^*(C)$ with vertices $\{1, 2, \ldots, m+n\}$,
and a linear subdigraph $L$ in $\mathcal{L}(C)$ with nonzero weight. In
order that $L$ have nonzero weight, exactly $n - m$ edges must
correspond to $-1$'s on the diagonal of $-I_n$. This implies that there
is a subset $K$ of $\{1, 2, \ldots, n\}$ of cardinality $m$ such that edges in
$L$ from these vertices go to vertices $1, 2, \ldots, m$ (and thus their
weights come from elements of $A$). It then follows that the edges
in $L$ from the last $m$ vertices (the vertices $n+1, n+2, \ldots, m+n$)
go to the vertices corresponding to the rows of $B$ whose ordinal
numbers are also in $K$.

Let $\mathcal{L}_K(C)$ be the subset of $\mathcal{L}(C)$ consisting of all those linear
subdigraphs for which the edges from the vertices in $K$ go to the
vertices $\{1, 2, \ldots, m\}$. We then have

$$\det(C) = (-1)^{m+n} \sum_{K \subseteq \{1, 2, \ldots, n\},\, |K| = m} w(\mathcal{L}_K(C)),$$

where

$$w(\mathcal{L}_K(C)) = \sum_{L \in \mathcal{L}_K(C)} (-1)^{c(L)} w(L).$$


² Note that if we had interchanged like this in the proof of Theorem 4.2.14
we would have the sign $(-1)^{n^2}$ instead of $(-1)^n$. Since $n^2$ is even if and only
if $n$ is even, we have $(-1)^{n^2} = (-1)^n$.

³ See the preceding footnote.

We may perform column interchanges so that the columns in $K$
come first (followed by the remaining columns in $A$ in the same
relative order they appear in $A$) and do similar row interchanges
so that the rows in $K$ of $B$ come first in $C$. This implies that
$w(\mathcal{L}_K(C))$ equals

$$\det \begin{bmatrix} A[\{1, 2, \ldots, m\}, K] & O & O \\ O & O & B[K, \{1, 2, \ldots, m\}] \\ O & -I_{n-m} & O \end{bmatrix}.$$

By interchanging $(n-m)m$ pairs of columns we see that $w(\mathcal{L}_K(C))$
equals

$$(-1)^{(n-m)m} \det \begin{bmatrix} A[\{1, 2, \ldots, m\}, K] & O & O \\ O & B[K, \{1, 2, \ldots, m\}] & O \\ O & O & -I_{n-m} \end{bmatrix}.$$

By two applications of Theorem 4.2.13 applied to this last
determinant, we see that $w(\mathcal{L}_K(C))$ equals

$$(-1)^{(n-m)m} \det A[\{1, 2, \ldots, m\}, K] \, \det B[K, \{1, 2, \ldots, m\}] \, (-1)^{n-m}.$$

Now, using (4.11), we see that the contribution of $\mathcal{L}_K(C)$ to
$\det(AB)$ equals

$$(-1)^{mn+n+m(n-m)+(n-m)} \det A[\{1, 2, \ldots, m\}, K] \, \det B[K, \{1, 2, \ldots, m\}].$$

Because $mn + n + m(n-m) + (n-m) = 2mn + 2n - m(m+1)$
is always an even number, we get (4.9).


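Example 4.2.17 Let

$$A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{bmatrix}.$$

Then

$$AB = \begin{bmatrix} 22 & 28 \\ 49 & 64 \end{bmatrix}$$

and so

$$\det(AB) = (22)(64) - (28)(49) = 1408 - 1372 = 36.$$

Using the Binet-Cauchy theorem we get

$$\begin{aligned}
\det(AB) &= \det \begin{bmatrix} 1 & 2 \\ 4 & 5 \end{bmatrix} \det \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} + \det \begin{bmatrix} 1 & 3 \\ 4 & 6 \end{bmatrix} \det \begin{bmatrix} 1 & 2 \\ 5 & 6 \end{bmatrix} + \det \begin{bmatrix} 2 & 3 \\ 5 & 6 \end{bmatrix} \det \begin{bmatrix} 3 & 4 \\ 5 & 6 \end{bmatrix} \\
&= (-3)(-2) + (-6)(-4) + (-3)(-2) = 6 + 24 + 6 = 36.
\end{aligned}$$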
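Formula (4.9) is easy to evaluate by enumerating the subsets $K$. The following is a minimal sketch; the names `det`, `mat_mul`, and `binet_cauchy` are illustrative, indices are numbered from 0, and the 2 by 3 and 3 by 2 matrices are chosen so that $\det(AB) = 36$ as in Example 4.2.17.

```python
from itertools import combinations

def det(M):
    """Determinant by development along the first row (Theorem 4.2.11)."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def mat_mul(X, Y):
    """Matrix product of lists of rows (X is p by q, Y is q by r)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def binet_cauchy(A, B):
    """Right side of (4.9) for an m-by-n matrix A and an n-by-m matrix B."""
    m, n = len(A), len(B)
    total = 0
    for K in combinations(range(n), m):        # no subsets exist when m > n
        A_K = [[A[i][k] for k in K] for i in range(m)]   # columns K of A
        B_K = [[B[k][j] for j in range(m)] for k in K]   # rows K of B
        total += det(A_K) * det(B_K)
    return total

A = [[1, 2, 3], [4, 5, 6]]
B = [[1, 2], [3, 4], [5, 6]]
print(det(mat_mul(A, B)), binet_cauchy(A, B))
print(det(mat_mul(B, A)), binet_cauchy(B, A))
```

The second line of output illustrates the case $m > n$: the product of a 3 by 2 and a 2 by 3 matrix has determinant 0, and the sum in (4.9) is empty.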


4.3 A Special Determinant Formula 


Let $A = [a_{ij}]$ be a matrix of order $n$, and let $\lambda$ be a variable. The
matrix $A + \lambda I$ is obtained from $A$ by adding $\lambda$ to each diagonal
entry. Thus the weight $a_{ii}$ of the loop at vertex $i$ in $D^*(A)$ is
replaced by $a_{ii} + \lambda$ in $D^*(A + \lambda I)$; there are no other changes in
the weights. It follows from the definition of the determinant that
$\det(A + \lambda I)$ is a polynomial in $\lambda$ of degree $n$. In this section, we
identify that polynomial.

Let $\mathcal{L}$ be the set of linear subdigraphs of $D^*(A + \lambda I)$. By
definition,

$$\det(A + \lambda I) = (-1)^n \sum_{L \in \mathcal{L}} (-1)^{c(L)} w(L). \tag{4.12}$$

Consider a linear subdigraph $L$ and suppose that $L$ contains
exactly $k$ loops $(0 \le k \le n)$. Let the vertices with these loops be
the vertices $i_1, i_2, \ldots, i_k$ with $1 \le i_1 < i_2 < \cdots < i_k \le n$ (if $k = 0$,
there are no loops). Then the weight of $L$ is
$(a_{i_1i_1} + \lambda)(a_{i_2i_2} + \lambda) \cdots (a_{i_ki_k} + \lambda)\,\tilde{w}(L)$, where $\tilde{w}(L)$ is the product of the weights of
the nonloop edges in $L$ and thus does not depend on $\lambda$. There are
$2^k$ terms when the product $(a_{i_1i_1} + \lambda)(a_{i_2i_2} + \lambda) \cdots (a_{i_ki_k} + \lambda)$ is
multiplied out, and these terms are of the form

$$\lambda^p a_{j_1j_1} a_{j_2j_2} \cdots a_{j_{k-p}j_{k-p}},$$

where $0 \le p \le k$ and $\{j_1, j_2, \ldots, j_{k-p}\}$ is a subset of $\{i_1, i_2, \ldots, i_k\}$.
Thus the weight of $L$ satisfies

$$w(L) = \sum \lambda^p w(L'), \tag{4.13}$$

where $L'$ is a linear subdigraph of the digraph $D^*(A')$ (an
induced subdigraph of $D^*(A)$) and where $A'$ is a principal
submatrix⁴ of $A$ of order $n - p$ obtained by striking out $p$ rows
and $p$ columns, namely, those rows and columns whose indices
belong to $\{i_1, i_2, \ldots, i_k\} \setminus \{j_1, j_2, \ldots, j_{k-p}\}$, the complement of
$\{j_1, j_2, \ldots, j_{k-p}\}$ in $\{i_1, i_2, \ldots, i_k\}$. Conversely, every such linear
subdigraph $L'$ contributes a term to (4.13).

Putting this all together we obtain the determinant formula in 
the following theorem: 


Theorem 4.3.1 Let $A$ be a matrix of order $n$. Then

$$\det(A + \lambda I) = \sum_{p=0}^{n} \lambda^p c_{n-p},$$

where $c_{n-p}$ equals the sum of the principal minors of order $n - p$
of $A$.


By replacing $\lambda$ with $-\lambda$ in Theorem 4.3.1, and then multiplying
$A - \lambda I$ by $-1$ to produce $\lambda I - A$, and by using the fact that
$(-1)^{n+p} = (-1)^{n-p}$, we obtain the following corollary.


Corollary 4.3.2 Let $A$ be a matrix of order $n$. Then

$$\det(A - \lambda I) = \sum_{p=0}^{n} (-1)^p \lambda^p c_{n-p},$$

where $c_{n-p}$ equals the sum of the principal minors of order $n - p$
of $A$. Equivalently,

$$\det(\lambda I - A) = \sum_{p=0}^{n} (-1)^{n-p} \lambda^p c_{n-p}.$$


⁴ Note: Principal submatrix of $A$, not principal submatrix of $A + \lambda I$.

In Theorem 4.3.1 and Corollary 4.3.2, because $\det A$ is the only
principal minor of order $n$ of $A$, the constant term $c_n$ equals the
determinant of $A$. The coefficient $c_1$ of $\lambda^{n-1}$ equals the sum of the
principal minors of order 1, and this is the trace of $A$. Thus the
trace of $A$ is given by

$$\mathrm{tr}(A) = a_{11} + a_{22} + \cdots + a_{nn}.$$
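The coefficients in Theorem 4.3.1 can be computed from the principal minors and compared with direct evaluation of $\det(A + \lambda I)$. The sketch below uses a hypothetical matrix of order 3 and assumes the convention $c_0 = 1$ (the determinant of the empty principal submatrix); the names `principal_minor_sum` and `shifted_det` are illustrative.

```python
from itertools import combinations

def det(M):
    """Determinant by development along the first row; det of the empty matrix is 1."""
    if not M:
        return 1
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def principal_minor_sum(A, k):
    """c_k: the sum of all principal minors of order k of A."""
    n = len(A)
    return sum(det([[A[i][j] for j in K] for i in K])
               for K in combinations(range(n), k))

def shifted_det(A, lam):
    """det(A + lam * I), computed directly."""
    n = len(A)
    return det([[A[i][j] + (lam if i == j else 0) for j in range(n)]
                for i in range(n)])

A = [[1, 2, 0], [3, 1, 1], [0, 2, 1]]    # hypothetical data
n = len(A)
c = [principal_minor_sum(A, k) for k in range(n + 1)]   # c[k] = c_k
formula_holds = all(
    shifted_det(A, lam) == sum(lam ** p * c[n - p] for p in range(n + 1))
    for lam in range(-4, 5))
print(c, formula_holds)
```

Here `c[1]` is the trace and `c[n]` is the determinant. Two polynomials of degree 3 that agree at nine values of $\lambda$ are identical, so the finite check confirms the formula for this matrix.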


Example 4.3.3 Let 


Then 


$$\det(A + \lambda I) = \lambda^3 + 2\lambda^2 + (-1)\lambda + (-1) = \lambda^3 + 2\lambda^2 - \lambda - 1.$$


4.4 Classical Definition of the 
Determinant 


The determinant formula in terms of linear subdigraphs of the
Coates digraph, as given in Definition 4.1.1, is not the formula
that is usually given initially for the determinant. The classical
formula involves permutations and their signs (±). In this section
we show that this classical formula is equivalent to our formula.
Let $A = [a_{ij}]$ be a square matrix of order $n$, and let $L$ be a
linear subdigraph of $D^*(A)$. Because $L$ contains one edge into each
vertex and one edge out of each vertex, the weight of $L$ is the
product of $n$ entries of $A$ consisting simultaneously of one element
from each row of $A$ (of the $n$ edges in $L$, exactly one of them comes
into each vertex) and one entry from each column (of the $n$ edges
in $L$, exactly one of them comes out of each vertex). If we arrange
these $n$ entries according to increasing row indices, then we see
that

$$w(L) = a_{1j_1} a_{2j_2} \cdots a_{nj_n}, \tag{4.14}$$

where $(j_1, j_2, \ldots, j_n)$ is a permutation⁵ of $\{1, 2, \ldots, n\}$. Thus the
edges of $L$ are the $n$ edges from vertex $j_1$ to vertex 1, from vertex
$j_2$ to vertex 2, ..., and from vertex $j_n$ to vertex $n$. Conversely,
each product of $n$ entries of $A$, one from each row and
simultaneously one from each column (so a product as given in (4.14)),
is the weight of some linear subdigraph of $D^*(A)$. In formula
(4.2) for the determinant, $w(L)$ has a sign affixed in front of it,
namely, $(-1)^{n-c(L)}$. Let $S_n$ denote the set of all $n!$ permutations
of $\{1, 2, \ldots, n\}$. Using our notation, we can write

$$\det A = \sum_{(j_1, j_2, \ldots, j_n) \in S_n} (-1)^{n-c(L)} a_{1j_1} a_{2j_2} \cdots a_{nj_n}.$$

What we would like to do is determine how to write the
sign $(-1)^{n-c(L)}$ in terms of the corresponding permutation
$(j_1, j_2, \ldots, j_n)$.

Let $\sigma = (j_1, j_2, \ldots, j_n)$ be in $S_n$. An inversion of $\sigma$ is a pair
$k, l$ of integers with $1 \le k < l \le n$ such that $j_k > j_l$. Thus an
inversion represents a pair of integers out of their natural order in
$\sigma$. Let $\#(\sigma)$ equal the number of inversions of $\sigma$. The sign of the
permutation $\sigma$ is defined to be $(-1)^{\#(\sigma)}$. The permutation $\sigma$ is an
even permutation if it has an even number of inversions (i.e., its
sign is $+1$) and is an odd permutation if it has an odd number of
inversions (i.e., its sign is $-1$).


Example 4.4.1 Let $n = 6$ and let $\sigma = (5,4,1,3,6,2)$. Then $\sigma$
has inversions corresponding to the following pairs of integers out
of their natural order in $\sigma$:


5,4; 5,1; 5,3; 5,2; 4,1; 4,3; 4,2; 3,2; 6,2.


Since $\#(\sigma) = 9$, $\sigma$ is an odd permutation (its sign is -1). The
permutation $\sigma$ corresponds to a linear subdigraph $L$ of the Coates
digraph of a matrix of order 6, where $L$ is a cycle consisting of
edges from vertices 5 to 1, 1 to 3, 3 to 4, 4 to 2, 2 to 6, and 6 to 5.

Now consider the identity permutation $\iota = (1,2,3,4,5,6)$.
Then $\iota$ has no inversions and so is an even permutation (its sign
is +1). The permutation $\iota$ corresponds to a linear subdigraph of
a Coates digraph consisting of a loop at each of the six vertices.
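
The counts in Example 4.4.1 can be checked mechanically. The following is a minimal Python sketch, not part of the original text; the function names `inversions` and `cycle_count` are illustrative assumptions. A permutation is stored as a tuple $(j_1, \ldots, j_n)$ with entries from 1 to $n$, and the cycles are those of the linear subdigraph with an edge from $j_i$ to $i$ for each $i$.

```python
def inversions(perm):
    """Count pairs k < l with perm[k] > perm[l]."""
    n = len(perm)
    return sum(1 for k in range(n) for l in range(k + 1, n) if perm[k] > perm[l])


def cycle_count(perm):
    """Count the cycles of the linear subdigraph with edges j_i -> i.

    The walk below follows each edge backwards (from i to j_i); this
    visits the same cycles, so the count is unchanged.
    """
    n = len(perm)
    seen = [False] * (n + 1)  # index 0 is unused; vertices are 1..n
    cycles = 0
    for start in range(1, n + 1):
        if not seen[start]:
            cycles += 1
            v = start
            while not seen[v]:
                seen[v] = True
                v = perm[v - 1]
    return cycles


sigma = (5, 4, 1, 3, 6, 2)
iota = (1, 2, 3, 4, 5, 6)
print(inversions(sigma), cycle_count(sigma))  # 9 inversions, 1 cycle
print(inversions(iota), cycle_count(iota))    # 0 inversions, 6 cycles
```

For $\sigma$ the two quantities $\#(\sigma) = 9$ and $n - c(\sigma) = 5$ are both odd, and for $\iota$ both are 0.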


²We now write a permutation of $\{1,2,\ldots,n\}$ as an $n$-tuple.

In what is to follow we refer to the decomposition into cycles
of the linear subdigraph $L$ corresponding to a permutation $\sigma$ of
$\{1,2,\ldots,n\}$ as the cycle decomposition of the permutation $\sigma$, and
we denote the number of cycles by $c(\sigma)$. Thus $c(\sigma) = c(L)$.


Lemma 4.4.2 Let $\sigma = (j_1, j_2, \ldots, j_n)$ be a permutation of $\{1,2,\ldots,n\}$.
Then $\#(\sigma)$ and $n - c(\sigma)$ have the same parity. Therefore,


\[ (-1)^{n-c(\sigma)} = (-1)^{\#(\sigma)}. \]


Proof. We prove the lemma by backwards induction on the number
of cycles of $\sigma$. To get the induction started, assume that
$c(\sigma) = n$, the largest possible number. Then $\sigma$ is the identity
permutation, $\#(\sigma) = 0$, and $c(\sigma) = n$. Hence, in this case,
$n - c(\sigma) = \#(\sigma) = 0$; in particular, $n - c(\sigma)$ and $\#(\sigma)$ have
the same parity.

We now assume that $c(\sigma) < n$. Then $\sigma \ne \iota$ and $\sigma$ contains a
cycle $j_{k_1}$ to $j_{k_2}$, $j_{k_2}$ to $j_{k_3}$, ..., $j_{k_{t-1}}$ to $j_{k_t}$, $j_{k_t}$ to $j_{k_1}$
of $t \ge 2$ elements where $k_1 < k_2 < \cdots < k_t$. There must be
an inversion pair $j_{k_r}$ and $j_{k_s}$ with $r < s$ and $j_{k_r} > j_{k_s}$. If we
interchange $j_{k_r}$ and $j_{k_s}$ in $\sigma$, we obtain a new permutation $\tau$ such
that $\#(\tau)$ and $\#(\sigma)$ differ by an odd number and, in addition,
$c(\tau) = c(\sigma) + 1$ (and so differ by an odd number). By induction,
$\#(\tau)$ and $n - c(\tau)$ have the same parity, and hence so do $\#(\sigma)$
and $n - c(\sigma)$.


Summarizing, we now arrive at a theorem containing the clas- 
sical definition of the determinant. 


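Theorem 4.4.3 Let $A = [a_{ij}]$ be a square matrix of order $n$. Then


\[ \det A = \sum_{(j_1,j_2,\ldots,j_n) \in S_n} (-1)^{\#(j_1,j_2,\ldots,j_n)} a_{1j_1} a_{2j_2} \cdots a_{nj_n} \tag{4.15} \]


where the summation extends over all permutations $(j_1, j_2, \ldots, j_n)$
of the integers $1,2,\ldots,n$.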
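
Formula (4.15) translates directly into code. The sketch below is an illustration only (the name `det_by_permutations` is an assumption, and indices are 0-based rather than 1-based as in the text). It sums over all $n!$ permutations, so it is practical only for small $n$.

```python
from itertools import permutations


def inversions(perm):
    """Number of pairs k < l with perm[k] > perm[l]."""
    n = len(perm)
    return sum(1 for k in range(n) for l in range(k + 1, n) if perm[k] > perm[l])


def det_by_permutations(A):
    """Evaluate the classical formula: signed sum of products a[0][j0]*...*a[n-1][jn-1]."""
    n = len(A)
    total = 0
    for perm in permutations(range(n)):
        term = -1 if inversions(perm) % 2 else 1  # sign (-1)^{#(perm)}
        for i in range(n):
            term *= A[i][perm[i]]
        total += term
    return total


print(det_by_permutations([[1, 0, 2], [2, 1, 0], [0, 2, 1]]))  # 9
```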
Example 4.4.4 Let


oO On O] 


SION 
O0 © 
mee ona 


It is straightforward to check that there are only three permutations
that give nonzero terms in formula (4.15), namely, (1,2,3,4) (no 
inversions), (2,3, 4, 1) (3 inversions), and (4, 2,3, 1) (5 inversions). 
These permutations are even, odd, and odd, respectively. Hence 


det A = adeg — begp — cdf p. 


We conclude this section by reformulating the classical definition
of the determinant in terms of the König digraph of a matrix.
Let $A = [a_{ij}]$ be a square matrix of order $n$. The König digraph
$G(A)$ has $n$ black vertices and $n$ white vertices. A collection $F$ of $n$
edges of $G(A)$, one leaving each black vertex and one terminating
at each white vertex, is a 1-factor of $G(A)$ (see Section 1.1). The
weight $w(F)$ of the 1-factor $F$ is the product of the weights of its
edges. The 1-factors of $G(A)$ are in one-to-one correspondence
with the terms in the classical determinant formula (4.15). Let
$(j_1, j_2, \ldots, j_n)$ be a permutation of $\{1,2,\ldots,n\}$. To the term
\[ (-1)^{\#(j_1,j_2,\ldots,j_n)} a_{1j_1} a_{2j_2} \cdots a_{nj_n} \]
in formula (4.15) we let correspond the $n$ edges $e_1, e_2, \ldots, e_n$ of
$G(A)$, where $e_i$ is the edge from black vertex $i$ to white vertex
$j_i$ $(i = 1,2,\ldots,n)$. Because $(j_1, j_2, \ldots, j_n)$ is a permutation of
$\{1,2,\ldots,n\}$, the resulting set of edges $\{e_1, e_2, \ldots, e_n\}$ is a 1-factor
$F$ of $G(A)$ and its weight is $w(F) = a_{1j_1} a_{2j_2} \cdots a_{nj_n}$. Each 1-factor
of $G(A)$ arises from a permutation of $\{1,2,\ldots,n\}$ in this way.
Let us draw the digraph $G(A)$ so that white vertex $i$ is placed
directly above black vertex $i$, as in Figure 4.9.
[Figure 4.9: the white vertices 1, 2, 3, ..., n drawn in a row directly above the black vertices 1, 2, 3, ..., n.]


Let $q(F)$ equal the number of pairs of edges in $F$ that intersect
each other in drawing $G(A)$ in this way. Let $e_k$, joining black vertex
$k$ to white vertex $j_k$, and $e_l$, joining black vertex $l$ to white
vertex $j_l$, be two edges in $F$ with $k < l$. Then $e_k$ and $e_l$ intersect
exactly when $j_k > j_l$. Thus the intersections of edges of $F$
are in one-to-one correspondence with the inversions of the permutation
$(j_1, j_2, \ldots, j_n)$, and hence $q(F) = \#(j_1, j_2, \ldots, j_n)$. Let
$\mathcal{F}(A)$ denote the collection of 1-factors of $G(A)$. We then have the
following reformulation for the determinant of $A$:


\[ \det A = \sum_{F \in \mathcal{F}(A)} (-1)^{q(F)} w(F). \tag{4.16} \]


4.5 Laplace Development of the 
Determinant 


In this section we generalize the recursive formula
\[ \det A = \sum_{j=1}^{n} (-1)^{i+j} a_{ij} \det A_{ij} \quad (i = 1,2,\ldots,n) \]
for the determinant given in Theorem 4.2.11.
Let $A = [a_{ij}]$ be a square matrix of order $n$. Let


\[ K = \{k_1, k_2, \ldots, k_v\} \quad\text{and}\quad L = \{l_1, l_2, \ldots, l_v\} \]


be subsets of {1,2,...,n} of the same cardinality v. Recall that 
det A[K, L] is a minor of A of order v. 
Let $\overline{K} = \{1,2,\ldots,n\} \setminus K$ and $\overline{L} = \{1,2,\ldots,n\} \setminus L$ be the
complements of $K$ and $L$ in $\{1,2,\ldots,n\}$, respectively. (We always
assume that the indices are written in increasing order.) Then


\[ \widetilde{A}[K, L] = (-1)^{k_1+k_2+\cdots+k_v+l_1+l_2+\cdots+l_v} \det A[\overline{K}, \overline{L}] \]


is the algebraic complement or cofactor of the minor
$\det A[K, L]$. This definition generalizes the definitions of cofactor
tor and algebraic complement given in Section 4.4 for elements 
(i.e., minors of order 1). The generalization of Theorem 4.2.11, 
the general Laplace development of the determinant, is given in 
the next theorem. It asserts that the determinant of a matrix 
A can be evaluated by first choosing a set K of rows and then 
summing up the products of each minor formed out of those rows 
with its algebraic complement. A similar development results by 
replacing rows with columns. 


Theorem 4.5.1 Let $K \subseteq \{1,2,\ldots,n\}$ with $|K| = v$. Then
\[ \det A = \sum_{L \subseteq \{1,2,\ldots,n\},\ |L| = v} \det A[K, L]\, \widetilde{A}[K, L], \tag{4.17} \]


where, as indicated, the summation is taken over all the $\binom{n}{v}$ subsets
$L$ of $\{1,2,\ldots,n\}$ of cardinality $v$.


Proof. We use the formula given in (4.16),


\[ \det A = \sum_{F \in \mathcal{F}(A)} (-1)^{q(F)} w(F), \tag{4.18} \]


that evaluates the determinant in terms of the König digraph
$G(A)$.

Let $K = \{k_1, k_2, \ldots, k_v\}$. A 1-factor $F$ in $\mathcal{F}(A)$ contains one
edge leaving each black vertex. Let the edges of $F$ leaving the black
vertices with labels in $K$ terminate in those white vertices whose
set of labels is $L = \{l_1, l_2, \ldots, l_v\}$. We partition the 1-factors in
$\mathcal{F}(A)$ into $\binom{n}{v}$ sets by putting in $\mathcal{F}_L(A)$ all those 1-factors $F$ with
the same $L$. Thus we may write (4.18) as


\[ \det A = \sum_{L \subseteq \{1,2,\ldots,n\},\ |L| = v} \ \sum_{F \in \mathcal{F}_L(A)} (-1)^{q(F)} w(F). \]
Each 1-factor $F$ in $\mathcal{F}_L(A)$ consists of a 1-factor $F_1$ in the subgraph
of $G(A)$ induced on the black vertices with labels in $K$ and white vertices
with labels in $L$, and a 1-factor $F_2$ joining the black vertices
with labels in the complement $\overline{K}$ and white vertices with labels in
the complement $\overline{L}$. We see that $F_1$ corresponds to a 1-factor of
$G(A[K, L])$ with the same weight, and $F_2$ corresponds to a 1-factor
of $G(A[\overline{K}, \overline{L}])$ with the same weight. We also see that
\[ q(F) = q(F_1) + q(F_2) + t, \]
where $t$ is the number of intersections of edges in $F_1$ with edges
in $F_2$. If we switch the places of a black vertex in $K$ with a
black vertex in $\overline{K}$ immediately to its left, we reduce the number of
intersections of edges in $F_1$ with edges in $F_2$ by 1. Hence, by
\[ r = (k_1 - 1) + (k_2 - 2) + \cdots + (k_v - v) \]
switches, we bring $k_1, k_2, \ldots, k_v$ to the first $v$ positions, reducing
the number of intersections by $r$. Similarly, by
\[ s = (l_1 - 1) + (l_2 - 2) + \cdots + (l_v - v) \]
switches of white vertices, we bring $l_1, l_2, \ldots, l_v$ to the first $v$ positions,
reducing the number of intersections by $s$. We conclude
that
\[ t = (k_1 - 1) + (k_2 - 2) + \cdots + (k_v - v) + (l_1 - 1) + (l_2 - 2) + \cdots + (l_v - v) \]
\[ \phantom{t} = k_1 + k_2 + \cdots + k_v + l_1 + l_2 + \cdots + l_v + (\text{an even number}). \]

Now we obtain that $(-1)^{q(F)} w(F)$ equals
\[ (-1)^{q(F_1)} w(F_1) \cdot (-1)^{k_1+k_2+\cdots+k_v+l_1+l_2+\cdots+l_v} (-1)^{q(F_2)} w(F_2), \]
from which formula (4.17) now follows.
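
As a numerical sanity check of Theorem 4.5.1, the development (4.17) can be compared with a direct evaluation of the determinant. This is a minimal sketch under stated assumptions: the helper names (`det`, `submatrix`, `laplace`) and the test matrix are illustrative and not taken from the text, and indices are 0-based. With 0-based indices the exponent $k_1 + \cdots + k_v + l_1 + \cdots + l_v$ changes by the even number $2v$, so the sign is unaffected.

```python
from itertools import combinations, permutations


def det(A):
    """Determinant via the permutation formula (4.15); the 0 x 0 determinant is 1."""
    n = len(A)
    total = 0
    for perm in permutations(range(n)):
        inv = sum(1 for k in range(n) for l in range(k + 1, n) if perm[k] > perm[l])
        term = -1 if inv % 2 else 1
        for i in range(n):
            term *= A[i][perm[i]]
        total += term
    return total


def submatrix(A, rows, cols):
    """The submatrix A[rows, cols], keeping the given index order."""
    return [[A[i][j] for j in cols] for i in rows]


def laplace(A, K):
    """Laplace development of det A along the rows in K (a sorted tuple of indices)."""
    n = len(A)
    K_bar = [i for i in range(n) if i not in K]
    total = 0
    for L in combinations(range(n), len(K)):
        L_bar = [j for j in range(n) if j not in L]
        sign = -1 if (sum(K) + sum(L)) % 2 else 1
        total += sign * det(submatrix(A, K, L)) * det(submatrix(A, K_bar, L_bar))
    return total


A = [[2, 0, 1, 3], [1, 1, 0, 2], [0, 4, 1, 1], [3, 0, 2, 1]]
print(det(A) == laplace(A, (0, 1)) == laplace(A, (1, 3)))  # True
```

Taking `K` to be a single row recovers the ordinary development along that row.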


Example 4.5.2 Let 


N OW e 
Om.. oo 
O we N 
=. NOOO 
In Theorem 4.5.1, let $K = \{1,2\}$. Then
\[
\begin{aligned}
\det A ={}& (-1)^{1+2+1+2} \det A[\{1,2\},\{1,2\}] \det A[\{3,4\},\{3,4\}] \\
&+ (-1)^{1+2+1+3} \det A[\{1,2\},\{1,3\}] \det A[\{3,4\},\{2,4\}] \\
&+ (-1)^{1+2+1+4} \det A[\{1,2\},\{1,4\}] \det A[\{3,4\},\{2,3\}] \\
&+ (-1)^{1+2+2+3} \det A[\{1,2\},\{2,3\}] \det A[\{3,4\},\{1,4\}] \\
&+ (-1)^{1+2+2+4} \det A[\{1,2\},\{2,4\}] \det A[\{3,4\},\{1,3\}] \\
&+ (-1)^{1+2+3+4} \det A[\{1,2\},\{3,4\}] \det A[\{3,4\},\{1,2\}] \\
={}& 3 + 5 + 0 + 8 + 0 - 0 \\
={}& 16.
\end{aligned}
\]


4.6 Exercises 


1. Let $A = [a_{ij}]$ be the matrix of order $2n+1$ such that $a_{ij} = 0$
whenever $i + j$ is an even integer. Prove that $\det A = 0$.


2. Let $A$ be a matrix of order $n$ and let $k$ be a positive integer.
Show that if $\det(A^k) = 0$, then $\det A = 0$.


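3. Calculate the determinant of the matrix


\[ A_n(a_1, a_2, \ldots, a_n) = \begin{bmatrix} a_1 & 1 & 1 & \cdots & 1 \\ 1 & a_2 & 1 & \cdots & 1 \\ 1 & 1 & a_3 & \cdots & 1 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & 1 & 1 & \cdots & a_n \end{bmatrix} \]


in which all the off-diagonal entries equal 1.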


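4. Prove that

\[ \det \begin{bmatrix} a_0 & a_1 & a_2 & \cdots & a_{n-1} & a_n \\ -1 & x & 0 & \cdots & 0 & 0 \\ 0 & -1 & x & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & -1 & x \end{bmatrix} = \sum_{i=0}^{n} a_i x^{n-i}. \]
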
5. Let

\[ A = \begin{bmatrix} 0 & -1 & 0 \\ 0 & 0 & -1 \\ -3 & -1 & 3 \end{bmatrix}. \]


Find all values of $\lambda$ for which $\det(A + \lambda I_3) = 0$.


6. Let $A$ be a matrix of order $n$ with a $k \times (n - k + 1)$ zero
submatrix for some $k$ with $k = 1,2,\ldots,n-1$. Show that
$\det A = 0$.


7. From the fact that the matrix of order $n \ge 2$ with all entries
equal to 1 has determinant equal to 0, conclude that
the number of odd permutations of $\{1,2,\ldots,n\}$ equals the
number of even permutations.


8. Compute the determinant of the matrix


1 2 —3 1 
20 1 0 
3 1 2 Sah 
01 0 3 


using the Laplace development along rows {2,4}. 


9. Compute the determinant of the matrix in the previous exercise
using the Laplace development along columns {1,2}.

10. Calculate the determinant of the matrix A3 B? A?, where


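\[ A = \begin{bmatrix} 1 & 0 & 3 \\ 2 & -1 & -2 \\ 1 & 3 & 1 \end{bmatrix} \quad\text{and}\quad B = \begin{bmatrix} 0 & 3 & 1 \\ 1 & 0 & -3 \\ 2 & 1 & 4 \end{bmatrix}. \]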
11. Prove that
\[ \det \begin{bmatrix} 1 & x_1 & x_1^2 & \cdots & x_1^{n-1} \\ 1 & x_2 & x_2^2 & \cdots & x_2^{n-1} \\ 1 & x_3 & x_3^2 & \cdots & x_3^{n-1} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & x_n & x_n^2 & \cdots & x_n^{n-1} \end{bmatrix} = \prod_{1 \le i < j \le n} (x_j - x_i). \]


This determinant is called the Vandermonde determinant.


12. A matrix $A = [a_{ij}]$ of order $n$ is skew-symmetric provided
that $a_{ij} + a_{ji} = 0$ for all $i$ and $j$. Thus each entry on the
main diagonal of $A$ equals 0. Prove that the determinant of
a skew-symmetric matrix of odd order $n$ is 0.


13. Use the Binet-Cauchy formula to evaluate the determinant


of 
102 8 
2 1 0 1 


14. Use the Laplace development and the Binet-Cauchy formula
to show that if $A$ is an $m$ by $n$ matrix and $B$ is an $n$ by $m$
matrix, then


1 
0 
3 
1 


oOo F N 


O A 
det | BO | = det(AB). 
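15. Let $a \ne b$. Show that
\[ \det \begin{bmatrix} a+b & ab & 0 & \cdots & 0 & 0 \\ 1 & a+b & ab & \cdots & 0 & 0 \\ 0 & 1 & a+b & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & a+b & ab \\ 0 & 0 & 0 & \cdots & 1 & a+b \end{bmatrix} = \frac{a^{n+1} - b^{n+1}}{a - b}, \]
where the matrix is of order $n$.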


Chapter 5 


Matrix Inverses 


In this chapter we define the inverse of a square matrix and study 
some of its properties. We give a formula for the inverse in terms of 
determinants, and then give an interpretation in terms of graphs. 

Section 5.1 introduces the concept of the adjoint of a square 
matrix and establishes some of its properties that enable a con- 
struction of the inverse in Section 5.2. It is proved that a square 
matrix has an inverse if and only if the determinant of the matrix
is different from zero. In Section 5.3, cofactors of matrix entries 
are interpreted by special subgraphs of the Coates digraph associ- 
ated with the matrix, and this finally leads to a graph-theoretical 
formula for the entries of the inverse. 


5.1 Adjoint and Its Determinant 


Let $A = [a_{ij}]$ be a matrix of order $n$. We recall from Chapter 4
that the cofactor $\alpha_{ij}$ of the element $a_{ij}$ is defined by


\[ \alpha_{ij} = (-1)^{i+j} \det A_{ij}, \]
where $A_{ij}$ is the matrix of order $n - 1$ obtained from $A$ by deleting
row $i$ and column $j$. We also recall the developments of the
determinant along rows and columns given by


\[ \det A = \sum_{j=1}^{n} a_{ij}\alpha_{ij} = \sum_{j=1}^{n} (-1)^{i+j} a_{ij} \det A_{ij} \quad (i = 1,2,\ldots,n) \tag{5.1} \]

and

\[ \det A = \sum_{i=1}^{n} a_{ij}\alpha_{ij} = \sum_{i=1}^{n} (-1)^{i+j} a_{ij} \det A_{ij} \quad (j = 1,2,\ldots,n). \tag{5.2} \]
Consider the sum


\[ \sum_{j=1}^{n} a_{ij}\alpha_{kj}, \quad\text{where } k \ne i. \tag{5.3} \]

The cofactors $\alpha_{kj}$ occurring in this sum with $k \ne i$ do not depend
on row $k$ of $A$, since row $k$ is deleted in their definitions. Thus we
can replace row $k$ of $A$ by any row whatsoever without changing
the $\alpha_{kj}$ and (5.3). If we replace row $k$ by row $i$ in $A$, giving a matrix
$A'$ in which row $i$ of $A$ appears twice, then no change occurs in
(5.3), but now it represents the development of the determinant
of $A'$ along its $k$th row. Because $A'$ has two identical rows, its
determinant equals zero. Thus we have


\[ 0 = \sum_{j=1}^{n} a_{ij}\alpha_{kj} = \sum_{j=1}^{n} a_{ij} (-1)^{k+j} \det A_{kj} \quad (k \ne i), \tag{5.4} \]
and in a similar way we conclude that
\[ 0 = \sum_{i=1}^{n} a_{ij}\alpha_{ik} = \sum_{i=1}^{n} a_{ij} (-1)^{i+k} \det A_{ik} \quad (k \ne j). \tag{5.5} \]


We summarize what we have shown in the next theorem. 
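Theorem 5.1.1 Let $A = [a_{ij}]$ be a matrix of order $n$. Then


\[ \sum_{j=1}^{n} a_{ij} (-1)^{k+j} \det A_{kj} = \begin{cases} \det A & \text{if } k = i, \\ 0 & \text{if } k \ne i. \end{cases} \]


Similarly,


\[ \sum_{i=1}^{n} a_{ij} (-1)^{i+k} \det A_{ik} = \begin{cases} \det A & \text{if } k = j, \\ 0 & \text{if } k \ne j. \end{cases} \]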




In words, Theorem 5.1.1 asserts that the sum of the products 
of the entries in a row, respectively, column, of a matrix times the 
cofactors of the entries in a different row, respectively, column, 
equals zero. 

We now make a definition that will enable us to write the 
equations in Theorem 5.1.1 in a more compact matrix form. 


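Definition 5.1.2 The adjoint of the matrix $A = [a_{ij}]$ of order $n$
is the matrix


\[ \operatorname{adj} A = \begin{bmatrix} \alpha_{11} & \alpha_{12} & \cdots & \alpha_{1n} \\ \alpha_{21} & \alpha_{22} & \cdots & \alpha_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ \alpha_{n1} & \alpha_{n2} & \cdots & \alpha_{nn} \end{bmatrix}^{T} \]


obtained by replacing each entry $a_{ij}$ of $A$ by its cofactor $\alpha_{ij} =
(-1)^{i+j} \det A_{ij}$ and then transposing the resulting matrix.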


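Example 5.1.3 The adjoint of the general matrix
\[ A = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \]
of order 2 is given by
\[ \operatorname{adj} A = \begin{bmatrix} d & -c \\ -b & a \end{bmatrix}^{T} = \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}. \]


The adjoint of the matrix
\[ A = \begin{bmatrix} 1 & 0 & 2 \\ 2 & 1 & 0 \\ 0 & 2 & 1 \end{bmatrix} \]
is given by
\[ \operatorname{adj} A = \begin{bmatrix} 1 & -2 & 4 \\ 4 & 1 & -2 \\ -2 & 4 & 1 \end{bmatrix}^{T} = \begin{bmatrix} 1 & 4 & -2 \\ -2 & 1 & 4 \\ 4 & -2 & 1 \end{bmatrix}. \]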
The adjoint of the identity matrix $I_n$ is $I_n$ itself. This follows
since deleting row $i$ and column $j$ with $i \ne j$ always results in
a matrix with a zero row (and a zero column) and hence in a
matrix whose determinant equals 0, while deleting row and column
$i$ results in $I_{n-1}$, a matrix with determinant equal to 1.


Let $\operatorname{adj} A = [\beta_{ij}]$ so that $\beta_{ij} = \alpha_{ji}$ for each $i$ and $j$. Then the
equations in Theorem 5.1.1 can be written in the following forms:


\[ \sum_{j=1}^{n} a_{ij}\beta_{jk} = \sum_{j=1}^{n} a_{ij} (-1)^{k+j} \det A_{kj} = \begin{cases} \det A & \text{if } k = i, \\ 0 & \text{if } k \ne i, \end{cases} \]


and


\[ \sum_{i=1}^{n} \beta_{ki} a_{ij} = \sum_{i=1}^{n} a_{ij} (-1)^{i+k} \det A_{ik} = \begin{cases} \det A & \text{if } k = j, \\ 0 & \text{if } k \ne j. \end{cases} \]

These two sets of equations now give us the matrix equation 
in the next theorem. 


Theorem 5.1.4 If A is a square matrix of order n, then 


\[ A(\operatorname{adj} A) = (\operatorname{adj} A)A = (\det A) I_n. \]
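
Theorem 5.1.4 is easy to check numerically. The following is a minimal sketch, not part of the original text; the helper names (`det`, `cofactor`, `adjoint`, `matmul`) are illustrative assumptions and indices are 0-based. The determinant is computed by development along the first row, so this is suitable only for small matrices.

```python
def det(A):
    """Determinant by development along the first row; the 0 x 0 determinant is 1."""
    n = len(A)
    if n == 0:
        return 1
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in A[1:]]
        total += (-1) ** j * A[0][j] * det(minor)
    return total


def cofactor(A, i, j):
    """(-1)^(i+j) times the determinant of A with row i and column j deleted."""
    minor = [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]
    return (-1) ** (i + j) * det(minor)


def adjoint(A):
    """Transpose of the matrix of cofactors: entry (i, j) is the cofactor of a_ji."""
    n = len(A)
    return [[cofactor(A, j, i) for j in range(n)] for i in range(n)]


def matmul(X, Y):
    """Plain matrix product of two lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]


A = [[1, 0, 2], [2, 1, 0], [0, 2, 1]]
print(adjoint(A))             # [[1, 4, -2], [-2, 1, 4], [4, -2, 1]]
print(matmul(A, adjoint(A)))  # [[9, 0, 0], [0, 9, 0], [0, 0, 9]]
```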


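Example 5.1.5 Continuing with the matrix $A$ of order 3 in Example 5.1.3,
we calculate that $\det A = 9$,


\[ A(\operatorname{adj} A) = \begin{bmatrix} 1 & 0 & 2 \\ 2 & 1 & 0 \\ 0 & 2 & 1 \end{bmatrix} \begin{bmatrix} 1 & 4 & -2 \\ -2 & 1 & 4 \\ 4 & -2 & 1 \end{bmatrix} = 9 I_3, \]


and, in the same way, $(\operatorname{adj} A)A = 9 I_3$.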
5.2 Inverse of a Square Matrix


We begin with the definition of an inverse of a matrix. 


Definition 5.2.1 Let $A = [a_{ij}]$ be a square matrix of order $n$. A
matrix $B = [b_{ij}]$ of order $n$ is an inverse of $A$ provided $AB =
BA = I_n$. An inverse of a matrix $A$ is denoted by $A^{-1}$. If the
matrix $A$ has an inverse, then $A$ is called invertible and is also
sometimes called nonsingular. A singular matrix is a square matrix
that does not have an inverse.


In order that the notation for the inverse of a matrix not be 
ambiguous, we need to know that if a matrix has an inverse, then 
it has only one inverse. This, as well as some elementary properties 
of inverses, are contained in the next theorem. 


Theorem 5.2.2 Let $A$ be a square matrix of order $n$. Then:
(i) $A$ has at most one inverse.


(ii) $A$ has an inverse if and only if $\det A \ne 0$. If $\det A \ne 0$, then


\[ A^{-1} = \frac{1}{\det A} (\operatorname{adj} A). \]


(iii) If $B$ is a matrix of order $n$ such that $AB = I_n$, then also
$BA = I_n$ and $B$ is the inverse of $A$.


Proof. (i) Suppose that both $B$ and $C$ satisfy the definition
of an inverse of $A$. Then we calculate that


\[ B = B I_n = B(AC) = (BA)C = I_n C = C. \]


Thus $B = C$ and $A$ has at most one inverse.
(ii) First suppose that $A$ has an inverse $B$. Then $AB = I_n$. By
the multiplicative property of determinants,


\[ \det A \det B = \det AB = \det I_n = 1. \]
Hence $\det A \ne 0$. Conversely, suppose that $\det A \ne 0$. Then by
Theorem 5.1.4 we have


\[ A(\operatorname{adj} A) = (\operatorname{adj} A)A = (\det A) I_n. \]


Because $\det A \ne 0$, we get


\[ A\left(\frac{1}{\det A} \operatorname{adj} A\right) = \left(\frac{1}{\det A} \operatorname{adj} A\right) A = I_n, \]


and hence
\[ A^{-1} = \frac{1}{\det A} (\operatorname{adj} A). \]


(iii) Suppose $B$ is a matrix of order $n$ with $AB = I_n$. By the
multiplicative property of determinants again,


\[ \det A \det B = \det AB = \det I_n = 1. \]


Hence $\det A \ne 0$ and, by (ii), $A$ has an inverse $A^{-1}$. We calculate
that


\[ A^{-1} = A^{-1} I_n = A^{-1}(AB) = (A^{-1}A)B = I_n B = B. \]


Hence $B = A^{-1}$.
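
Part (ii) of Theorem 5.2.2 gives a direct, if inefficient, way to compute an inverse exactly. The sketch below is an illustration under stated assumptions: the function names are hypothetical, entries are assumed to be integers, and exact rational arithmetic from Python's `fractions` module stands in for a general field.

```python
from fractions import Fraction


def det(A):
    """Determinant by development along the first row; the 0 x 0 determinant is 1."""
    if not A:
        return 1
    return sum((-1) ** j * A[0][j] * det([row[:j] + row[j + 1:] for row in A[1:]])
               for j in range(len(A)))


def adjoint(A):
    """Entry (i, j) of adj A is the cofactor of a_ji."""
    n = len(A)

    def cofactor(i, j):
        minor = [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]
        return (-1) ** (i + j) * det(minor)

    return [[cofactor(j, i) for j in range(n)] for i in range(n)]


def inverse(A):
    """Return (1 / det A) * adj A, or raise ValueError when det A = 0."""
    d = det(A)
    if d == 0:
        raise ValueError("matrix is singular: det A = 0")
    return [[Fraction(x, d) for x in row] for row in adjoint(A)]


A = [[1, 0, 2], [2, 1, 0], [0, 2, 1]]
for row in inverse(A):
    print([str(x) for x in row])
# ['1/9', '4/9', '-2/9']
# ['-2/9', '1/9', '4/9']
# ['4/9', '-2/9', '1/9']
```

For matrices of any real size, elimination is far cheaper than the adjoint formula; the formula is mainly of theoretical interest.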


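Example 5.2.3 Continuing with the matrix


\[ A = \begin{bmatrix} 1 & 0 & 2 \\ 2 & 1 & 0 \\ 0 & 2 & 1 \end{bmatrix} \]
in Example 5.1.5, we have that


\[ A^{-1} = \frac{1}{9} \begin{bmatrix} 1 & 4 & -2 \\ -2 & 1 & 4 \\ 4 & -2 & 1 \end{bmatrix} = \begin{bmatrix} 1/9 & 4/9 & -2/9 \\ -2/9 & 1/9 & 4/9 \\ 4/9 & -2/9 & 1/9 \end{bmatrix}. \]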
5.3 Graph-Theoretic Interpretation


In this section we give a formula for the inverse of an invertible 
matrix in terms of its Coates digraph. In (ii) of Theorem 5.2.2, the 
elements of the inverse are expressed in terms of the cofactors and 
the determinant. In Chapter 4, we evaluated the determinant in 
terms of the Coates digraph, and so we now consider the cofactors 
from the viewpoint of the Coates digraph. 

First we introduce the idea of a 1-connection of a digraph. 


Definition 5.3.1 Let $D$ be a digraph with vertices $1,2,\ldots,n$.
Let $i$ and $j$ be vertices of $D$. A 1-connection of vertex $i$ to vertex
$j$ is a spanning subdigraph $D[i \to j]$ of $D$ with the following
properties:


If $i \ne j$, then
(i) exactly one edge leaves, but no edge enters, vertex $i$;
(ii) exactly one edge enters, but no edge leaves, vertex $j$;


(iii) for each vertex $k \ne i, j$, exactly one edge enters, and exactly
one edge leaves, vertex $k$.


If $i = j$, then
(i) no edges enter or leave vertex $i$;


(ii) for each vertex $k \ne i$, exactly one edge enters, and exactly
one edge leaves, vertex $k$.


It follows from the definition that a 1-connection $D[i \to j]$ is a
spanning subdigraph of $D$ consisting of a path from $i$ to $j$ (this
path is a path of length 0, that is, it is the single vertex $i$, if $i = j$)
and a possibly empty collection of pairwise vertex-disjoint cycles
having no vertex in common with the path. We let $c(D[i \to j])$
denote the number of cycles of $D[i \to j]$. As usual, if $D$ is a
weighted digraph, then the weight $w(D[i \to j])$ of $D[i \to j]$ is the
product of the weights of its edges.
In Figure 5.1, a digraph is displayed with several 1-connections.

[Figure 5.1: a digraph $D$ on the vertices 1, 2, 3, 4, together with the 1-connections $D[1 \to 1]$, $D[2 \to 4]$, and $D[1 \to 4]$.]


Let $A = [a_{ij}]$ be a square matrix of order $n$ and let $D^* = D^*(A)$
be the Coates digraph of $A$ whose edges are weighted by the entries
of $A$. There is a close relationship between the linear subdigraphs
of the digraph $D^*$ and its 1-connections. Let $L$ be a linear
subdigraph of $D^*$, and let $L$ contain the edge from vertex $j$ to
vertex $i$ of weight $a_{ij}$. Suppose that we delete from $L$ this edge
from vertex $j$ to vertex $i$. It follows from the definitions of a
1-connection and of a linear subdigraph that the result is a
1-connection $D^*[i \to j]$. If $i = j$, the edge deleted is a loop at vertex
$i$. The following relationships hold between the number of cycles
and the weight of the linear subdigraph $L$ and a corresponding
1-connection $D^*[i \to j]$:


\[ c(L) = c(D^*[i \to j]) + 1, \tag{5.6} \]
and

\[ w(L) = a_{ij}\, w(D^*[i \to j]). \tag{5.7} \]
Conversely, if $D^*[i \to j]$ is a 1-connection from $i$ to $j$, then, by
adding the edge from vertex $j$ to vertex $i$, we obtain a linear
subdigraph of $D^*$ satisfying (5.6) and (5.7).

According to Theorem 4.2.11, the cofactor $\alpha_{ij}$ of the element $a_{ij}$
of $A$ is the coefficient of $a_{ij}$ in the development of the determinant
of $A$ along row $i$. Let $\mathcal{L}_{ij}(A)$ denote the set of all linear subdigraphs
of $D^*$ containing the edge from $j$ to $i$. Then, from the definition
of a determinant as given in (4.1), we have

\[ \alpha_{ij} = (-1)^{n} \sum_{L \in \mathcal{L}_{ij}(A)} (-1)^{c(L)} \frac{w(L)}{a_{ij}}, \tag{5.8} \]
where the summation extends over all linear subdigraphs $L$ in
$\mathcal{L}_{ij}(A)$. Using (5.6) and (5.7), we get from (5.8) that
\[ \alpha_{ij} = (-1)^{n} \sum_{D^*[i \to j]} (-1)^{c(D^*[i \to j]) + 1}\, w(D^*[i \to j]), \tag{5.9} \]
where the summation extends over all 1-connections $D^*[i \to j]$ of
$D^*$ from $i$ to $j$.

We now obtain the following formula for the entries of the inverse of an
invertible matrix.


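Theorem 5.3.2 Let $A = [a_{ij}]$ be an invertible matrix of order $n$,
and let $A^{-1} = [a'_{ij}]$. Then
\[ a'_{ij} = \frac{\sum_{D^*[j \to i]} (-1)^{c(D^*[j \to i]) + 1}\, w(D^*[j \to i])}{\sum_{L \in \mathcal{L}(A)} (-1)^{c(L)} w(L)}. \tag{5.10} \]
Proof. By (ii) of Theorem 5.2.2,


\[ A^{-1} = \frac{1}{\det A} \operatorname{adj} A, \]
that is,
\[ a'_{ij} = \frac{\alpha_{ji}}{\det A}. \]


Substituting for $\alpha_{ji}$ the formula given in (5.9) and for $\det A$ the
formula given in (4.1), we get (5.10) by cancellation of $(-1)^n$.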
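
The graph-theoretic formula for the inverse can be tried out on small matrices. The sketch below is an illustration, not the book's procedure, and rests on one stated device: every linear subdigraph of the complete digraph on $n$ vertices is a permutation `succ`, where vertex `v` has the single outgoing edge to `succ[v]`, of weight `A[succ[v]][v]` as in the Coates digraph. Deleting the edge leaving `v` leaves a 1-connection from `succ[v]` to `v` with one cycle fewer; edges of weight 0 (which are absent from $D^*(A)$) simply contribute 0. All names are hypothetical and indices are 0-based.

```python
from fractions import Fraction
from itertools import permutations


def count_cycles(succ):
    """Number of cycles of the permutation v -> succ[v]."""
    seen = [False] * len(succ)
    cycles = 0
    for start in range(len(succ)):
        if not seen[start]:
            cycles += 1
            v = start
            while not seen[v]:
                seen[v] = True
                v = succ[v]
    return cycles


def coates_inverse(A):
    """Inverse of A as a ratio of signed weight sums over 1-connections and linear subdigraphs."""
    n = len(A)
    denom = 0                               # sum over L of (-1)^{c(L)} w(L)
    numer = [[0] * n for _ in range(n)]     # signed 1-connection sums
    for succ in permutations(range(n)):
        sign = -1 if count_cycles(succ) % 2 else 1
        weights = [A[succ[v]][v] for v in range(n)]
        w = 1
        for x in weights:
            w *= x
        denom += sign * w
        for v in range(n):
            rest = 1                        # weight of the 1-connection succ[v] -> v
            for u in range(n):
                if u != v:
                    rest *= weights[u]
            # (-1)^{c(1-connection) + 1} equals (-1)^{c(L)} = sign
            numer[v][succ[v]] += sign * rest
    if denom == 0:
        raise ValueError("matrix is singular")
    return [[Fraction(x, denom) for x in row] for row in numer]


print(coates_inverse([[0, 1], [-2, -3]]))
```

The enumeration visits all $n!$ permutations, so this is a check of the formula rather than a practical algorithm; the formula pays off when the digraph is sparse and few 1-connections exist.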
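Example 5.3.3 Let


\[ A = \begin{bmatrix} 0 & 1 & 0 & \cdots & 0 & 0 \\ 0 & 0 & 1 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & 0 & 1 \\ -a_n & -a_{n-1} & -a_{n-2} & \cdots & -a_2 & -a_1 \end{bmatrix}, \]


whose Coates digraph $D^*(A)$ is given in Figure 5.2, where all horizontal
edges from right to left have weight 1.


[Figure 5.2: the Coates digraph $D^*(A)$, including an edge of weight $-a_n$.]


The digraph $D^*(A)$ has only one linear subdigraph, and we get
$\det A = (-1)^n(-1)^1(-a_n) = (-1)^n a_n$. The only 1-connections of
$D^*(A)$ are the 1-connection $D^*(A)[i \to 1]$ with the weight $-a_{n-i}$
for $i = 1,2,\ldots,n-1$, the 1-connection $D^*(A)[n \to 1]$ with the
weight 1, and the 1-connection $D^*(A)[i \to i+1]$ with the weight
$-a_n$ for $i = 1,2,\ldots,n-1$. Hence


\[ A^{-1} = \begin{bmatrix} -\frac{a_{n-1}}{a_n} & -\frac{a_{n-2}}{a_n} & -\frac{a_{n-3}}{a_n} & \cdots & -\frac{a_1}{a_n} & -\frac{1}{a_n} \\ 1 & 0 & 0 & \cdots & 0 & 0 \\ 0 & 1 & 0 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & 1 & 0 \end{bmatrix}. \]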




5.4 Exercises 


1. Let $A$ be an invertible matrix of order $n$ and let $B$ be a
matrix of order $n$.

(a) Prove that $\det A^{-1} = (\det A)^{-1}$.

(b) Prove that $A^T$ is invertible and $(A^T)^{-1} = (A^{-1})^T$.

(c) If det A = 3 and det B = 4, what is det(A~!BA®B?)? 


2. Determine the inverse of a permutation matrix.


3. Prove that a triangular matrix is invertible if and only if it
does not have any zeros on its main diagonal.


4. Prove that the inverse of an invertible triangular matrix is
triangular.


5. Let $A$ be a matrix of order $n$. Prove that if $A$ is not invertible,
then neither is $\operatorname{adj} A$.


6. Let $A$ be a matrix of order $n$. Prove that $\det(\operatorname{adj} A) =
(\det A)^{n-1}$.


7. Let $A$ and $B$ be invertible matrices of order $n$. Prove that
$AB$ is invertible, indeed, that $(AB)^{-1} = B^{-1}A^{-1}$.
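8. Prove that
\[ \det \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} & x_1 \\ a_{21} & a_{22} & \cdots & a_{2n} & x_2 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} & x_n \\ x_1 & x_2 & \cdots & x_n & 0 \end{bmatrix} \]
equals


\[ -\left( \sum_{i=1}^{n} \alpha_{ii} x_i^2 + \sum_{1 \le i < j \le n} (\alpha_{ij} + \alpha_{ji}) x_i x_j \right). \]


Here $\alpha_{ij}$ is the cofactor of the element $a_{ij}$ in the matrix
$A = [a_{ij}]$ of order $n$.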
9. Find the inverses of each of the following matrices of order
n: 
(a) A= [ais], where Qij = Qiðij; 
(b) B= [b;;], where bij = Ina: 
(c) C= lent. where bij = Yiðij a Ôi j—1- 


10. Determine the inverse of the matrix 


0 


OO. O n 
Coo G FR 
on m 

oF Ooo 


Chapter 6 


Systems of Linear 
Equations 


First, we give a brief introduction to the solution of a system of m 
linear equations in n unknowns. In particular, we introduce the so- 
called reduced row-echelon form of a matrix and explain how it can 
be used in solving a system of linear equations. Then, using results 
from Chapter 5 on the adjoint and inverse of a square matrix, we 
derive an explicit formula (known as Cramer’s formula) for the 
solution of a linear system of n equations in n unknowns whose 
coefficient matrix is invertible. We then turn to graph-theoretical 
techniques for solving systems. In Section 6.3 we show how to use 
the Coates digraph (flow digraph) to solve the linear system. In 
the next section, we discuss the signal flow digraph approach (a 
variation of the previous technique) for solving a linear system. 
These two techniques, although valid in general, are efficient if the 
system matrix is sparse, that is, if it contains a lot of zero entries 
and the other entries are variables. Finally, in the last section we 
explain how to use graph-theoretical tools to treat systems with 
sparse matrices whose entries are given numerically. 


6.1 Solutions of Linear Systems 
We begin with some definitions. 
Definition 6.1.1 Let $F$ be a field. A linear system of $m$ equations
in $n$ unknowns is a system


\[
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2 \\
&\ \,\vdots \\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= b_m,
\end{aligned}
\]
or, in matrix form,
\[ Ax = b, \tag{6.1} \]
where
\[ A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} \]


is the matrix of coefficients and


\[ x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} \quad\text{and}\quad b = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix} \]


are the matrix-columns of unknowns and constant terms, respectively.
The solution set of the system (6.1) is the set of all column
vectors $x = u$ such that $Au = b$. The system may be consistent and
have at least one solution, or inconsistent and have no solutions.
If $b = 0$, then (6.1) is called a homogeneous system; otherwise,
it is called an inhomogeneous system. The homogeneous system
$Ax = 0$ is always consistent as $x = 0$ is always a solution; it is for
this reason that $x = 0$ is called the trivial solution of $Ax = 0$. The
solution set of the homogeneous system $Ax = 0$ is called the null
space of the matrix $A$. The null space of $A$ is always nonempty
as it contains the zero vector.


The null space of a matrix is a subspace of $F^n$. This follows
since if $u$ and $v$ are in the null space of $A$, then $Au = 0$ and $Av = 0$
imply that


\[ A(cu + dv) = cAu + dAv = c0 + d0 = 0 \]
for every choice of constants $c$ and $d$. In the next theorem we
relate the null space of $A$ to the solution set of $Ax = b$.


Theorem 6.1.2 Consider a linear system $Ax = b$. Let $x = w$ be
a particular solution of $Ax = b$, and let $U$ be the null space of $A$.
Then the solution set of $Ax = b$ is the set


\[ w + U = \{w + u : u \in U\} \]


of all vectors obtained by adding to $w$ a vector $u$ in the null space
of $A$.


Proof. Let $u$ be any vector in the null space of $A$. Then
$A(w + u) = Aw + Au = b + 0 = b$, and thus $x = w + u$ is a solution
of $Ax = b$. Conversely, let $x = w'$ be any solution of $Ax = b$, and
let $u = w' - w$. Then


\[ Au = A(w' - w) = Aw' - Aw = b - b = 0, \]


and so $u$ is a vector in the null space of $A$. Hence $w' = w + u$, so
that $w'$ has the required form.


It follows from Theorem 6.1.2 that by knowing one solution of
$Ax = b$ and all solutions of $Ax = 0$, we can obtain all solutions of
$Ax = b$.

In addition to the null space of $A$ there are two¹ other subspaces
that we associate with $A$. The first is the row space of $A$ consisting
of all vectors spanned by the rows of $A$; the second is the column
space of $A$ consisting of all vectors spanned by the columns of $A$.
Let the rows of $A$ be $\alpha_1, \alpha_2, \ldots, \alpha_m$. It follows from the definition
of a dot product that the null space of $A$ consists of all those vectors
$u = [u_1\ u_2\ \ldots\ u_n]^T$ such that $\alpha_i^T \cdot u = 0$, equivalently, $\alpha_i u = 0$,
for $i = 1,2,\ldots,m$. Because the row space of $A$ is spanned by
$\alpha_1, \alpha_2, \ldots, \alpha_m$, the null space of $A$ consists of all those vectors $u$
such that $\alpha^T \cdot u = 0$ for all vectors $\alpha$ in the row space of $A$.²


¹Actually there are three, but only two of them concern us in this brief
introduction. The third is the null space of the transpose $A^T$ of $A$, that is, all
vectors $u$ in $F^m$ such that $A^T u = 0$, equivalently, $u^T A = 0$.

²We can turn this around and say that the row space of $A$ consists of all
those vectors $\alpha$ such that $\alpha^T \cdot u = 0$ for all vectors $u$ in the null space of $A$.




Definition 6.1.3 Let $A = [a_{ij}]$ be an $m$ by $n$ matrix. The row
rank of A is the dimension rr(A) of the row space of A, equiva- 
lently, the maximum number of linearly independent rows of A. 
The column rank of A is the dimension cr(A) of the column space 
of A, equivalently, the maximum number of linearly independent 
columns of A. The nullity of A is the dimension n(A) of the null 
space of A. 


We now show how to find a basis of the row, column, and null 
spaces of a matrix A. 


Definition 6.1.4 Consider the linear system (6.1) of m equations 
in n unknowns. There are three types of elementary operations 
that can be performed on (6.1) without changing its set of solu- 
tions. These are 


I. Switch the order of two equations. 
II. Multiply both sides of one equation by a nonzero³ scalar c.
III. Add a multiple c of one equation to a second equation. 


Let
\[ A' = [A \ \ b] \]


be the $m$ by $n + 1$ augmented matrix of (6.1) obtained by affixing
to the coefficient matrix $A$ the column vector $b$ as a last column.
Then the elementary operations I, II, and III, when applied to $A'$
or $A$, are called elementary row operations (EROs for short) and
can be described as follows:


I. Switch the order of two rows.
II. Multiply a row by a nonzero scalar.


III. Add a multiple of one row to a second row. 


³If we were to multiply by zero, we would wipe out the equation, that is,
replace it with 0 = 0.

It is evident that operations I, II, and III do not change the
solution set. This is because each of them is reversible. To reverse
a type I operation, switch the equations (rows) back. To reverse a
type II operation, multiply the same equation (row) by the reciprocal
of $c$. To reverse a type III operation, add the multiple $-c$ of
the first equation (row) to the second equation (row).

EROs can be performed using matrix multiplication. Let
$I_m(i, j)$ be the matrix obtained by switching row $i$ and row $j$ of the
identity matrix $I_m$ of order $m$, where $1 \le i < j \le m$. Let $I_m(c \cdot i)$
be the matrix obtained by multiplying row $i$ of $I_m$ by the nonzero
scalar $c$, where $1 \le i \le m$. Let $I_m(c \cdot i + j)$ be the matrix obtained
from $I_m$ by adding $c$ times row $i$ to row $j$, where $1 \le i \ne j \le m$
and $c$ is a scalar.⁴ The matrices of these three types are called
elementary matrices.


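Example 6.1.5 To illustrate, we have


\[ I_4(2,4) = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix}, \quad I_4(c \cdot 2) = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & c & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}, \quad\text{and} \]
\[ I_4(3 \cdot 4 + 2) = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 3 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}. \]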


The above remark about reversibility of elementary operations
(EROs) can be restated in matrix terms as follows:


\[ I_m(i, j)^{-1} = I_m(i, j), \quad I_m(c \cdot i)^{-1} = I_m(c^{-1} \cdot i), \quad\text{and} \]


\[ I_m(c \cdot i + j)^{-1} = I_m(-c \cdot i + j). \]


In particular, the inverse of an elementary matrix is an elementary
matrix of the same type.
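
The three kinds of elementary matrices and their inverses can be built and checked in a few lines. This is a minimal sketch with hypothetical function names (`swap`, `scale`, `add_multiple`); rows are numbered from 1 as in the text, and exact rationals from `fractions` stand in for a general field.

```python
from fractions import Fraction


def identity(m):
    """The identity matrix I_m with Fraction entries."""
    return [[Fraction(int(r == c)) for c in range(m)] for r in range(m)]


def swap(m, i, j):
    """I_m(i, j): switch rows i and j of I_m."""
    E = identity(m)
    E[i - 1], E[j - 1] = E[j - 1], E[i - 1]
    return E


def scale(m, c, i):
    """I_m(c . i): multiply row i of I_m by the nonzero scalar c."""
    if c == 0:
        raise ValueError("the scalar must be nonzero")
    E = identity(m)
    E[i - 1][i - 1] = Fraction(c)
    return E


def add_multiple(m, c, i, j):
    """I_m(c . i + j): add c times row i of I_m to row j (i != j)."""
    if i == j:
        raise ValueError("rows i and j must differ")
    E = identity(m)
    E[j - 1][i - 1] = Fraction(c)
    return E


def matmul(X, Y):
    """Plain matrix product; multiplying on the left by E performs the ERO."""
    return [[sum(X[r][k] * Y[k][c] for k in range(len(Y))) for c in range(len(Y[0]))]
            for r in range(len(X))]


E = add_multiple(4, 3, 4, 2)
print(matmul(E, add_multiple(4, -3, 4, 2)) == identity(4))  # True
```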


⁴It is acceptable that $c = 0$, but then $I_m(0 \cdot i + j) = I_m$.
Using EROs, a system of linear equations (equivalently, its
augmented matrix) can be reduced to a simple form from which 
the solution set is then evident. The EROs of type III are used to 
eliminate variables from equations, that is, make the coefficients 
equal to zero. 


Definition 6.1.6 Let $R$ be an $m$ by $n$ matrix. Then $R$ is in
reduced row-echelon form, abbreviated rre-form, provided each of
the following properties holds:


(i) Zero rows, if present, come last.


(ii) The first nonzero entry in each nonzero row is a 1, called a
pivot, and every other entry in the column of that 1 equals
0.


(iii) If there are $k$ nonzero rows and the pivot 1 in row $i$ is in
column $p_i$, then $1 \le p_1 < p_2 < \cdots < p_k \le n$. (The matrix $R$
contains the identity matrix $I_k$ as a submatrix.)


Example 6.1.7 A zero matrix O is already in rre-form. The
matrix 


13 00 4 0 
001020 
000130 
00000 1 
000000 
000000 


is in rre-form. 


In the next theorem we show that every matrix may be put in 
rre-form using EROs. This process is often referred to as Gaussian 
elimination. 


Theorem 6.1.8 Let $A = [a_{ij}]$ be an $m$ by $n$ matrix. Then there
exists a sequence $P_1, P_2, \ldots, P_l$ of elementary matrices such that
$P_l \cdots P_2 P_1 A$ is a matrix in reduced row-echelon form. The matrix
$P = P_l \cdots P_2 P_1$, being a product of invertible matrices, is an
invertible matrix.
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Proof. We briefly describe the constructive proof in terms of 
EROs. Let the first nonzero column of A be column pı. Using a 
type IERO, if necessary, we may bring a nonzero entry in column 
pı to row 1. Using a type II ERO, if necessary, we may make 
that nonzero entry 1. Using type III EROs as necessary, we ob- 
tain a matrix with all other entries in column p; equal to 0. We 
now repeat with the submatrix determined by rows 2,...,m and 
columns pı + 1,...,n, obtaining a matrix with a 1 in row 2 and 
column pa > pı with all of the other entries in column 2 equal to 
0. Using a type III ERO we can also make the entry in row 1 and 
column pz equal to zero. We then consider the submatrix formed 
by rows 3,...,m and columns pa-+1,...,n and continue until only 
zero rows remain. 
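
The constructive proof can be turned into a short program. The following is a minimal sketch, not the book's own algorithm statement: the function name `rref` and the use of exact rational arithmetic are our assumptions, and the sketch clears the entries above and below each pivot in a single pass rather than in the two passes described in the proof.

```python
from fractions import Fraction

def rref(rows):
    """Return (R, pivots): the reduced row-echelon form of `rows` and its pivot columns."""
    a = [[Fraction(x) for x in row] for row in rows]
    m = len(a)
    n = len(a[0]) if m else 0
    pivots = []
    r = 0  # index of the row that receives the next pivot
    for col in range(n):
        # Type I ERO: find a row at or below r with a nonzero entry in this column.
        src = next((i for i in range(r, m) if a[i][col] != 0), None)
        if src is None:
            continue
        a[r], a[src] = a[src], a[r]
        # Type II ERO: scale the pivot row so that the pivot equals 1.
        p = a[r][col]
        a[r] = [x / p for x in a[r]]
        # Type III EROs: clear every other entry in the pivot column.
        for i in range(m):
            if i != r and a[i][col] != 0:
                c = a[i][col]
                a[i] = [x - c * y for x, y in zip(a[i], a[r])]
        pivots.append(col)
        r += 1
        if r == m:
            break
    return a, pivots

# Illustrative 3 by 3 matrix (not taken from the text); column indices are 0-based.
R, pivots = rref([[0, 2, 4], [1, 1, 1], [2, 4, 6]])
```

For this illustrative matrix the rre-form has two pivots and one zero row, so the matrix has rank 2.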


Example 6.1.9 When we obtain the rre-form of the augmented 
matrix [A b] of a system of linear equations, we can immediately
read off its set of solutions, or conclude that the system is incon- 
sistent. The system will be inconsistent exactly when one of the 
pivots occurs in the last column, the column corresponding to b. 
In this case, one of the equations becomes the contradictory equa- 
tion 0 = 1. In case of a homogeneous system, which is always 
consistent, we can use the coefficient matrix A itself rather than 
the augmented matrix [A 0]. 

For instance, suppose the rre-form of the augmented matrix of
a system $Ax = b$ of linear equations in unknowns $x_1, x_2, \ldots, x_6$ is

\[
\left[\begin{array}{cccccc|c}
1 & 3 & 0 & 0 & 4 & 0 & 2\\
0 & 0 & 1 & 0 & 2 & 0 & 1\\
0 & 0 & 0 & 1 & 3 & 0 & 5\\
0 & 0 & 0 & 0 & 0 & 1 & 3\\
0 & 0 & 0 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 0 & 0 & 0
\end{array}\right],
\]

where we have drawn a vertical line to separate the last column
corresponding to b from the other columns. Thus, with elementary
operations, the original system of equations has been reduced to
the following system, with the same set of solutions:

\[
\begin{aligned}
x_1 + 3x_2 + 4x_5 &= 2\\
x_3 + 2x_5 &= 1\\
x_4 + 3x_5 &= 5\\
x_6 &= 3.
\end{aligned}
\]

The variables corresponding to the pivots are $x_1, x_3, x_4, x_6$ and can
be solved in terms of the free variables $x_2, x_5$ as follows:

\[
\begin{aligned}
x_1 &= 2 - 3x_2 - 4x_5\\
x_3 &= 1 - 2x_5\\
x_4 &= 5 - 3x_5\\
x_6 &= 3.
\end{aligned} \tag{6.2}
\]

Thus $x_2$ and $x_5$ can take any values with $x_1, x_3, x_4$, and $x_6$ deter-
mined by (6.2). The dimension of the null space is the number 2
of free variables.


The reduction to rre-form gives rise to some important conse- 
quences, which we now elaborate on. 

Let A be an invertible matrix of order n. Let R be the rre-form 
of A so that, by Theorem 6.1.8, there is a product of elementary, 
and so invertible, matrices, $P = P_l \cdots P_2P_1$ such that $PA = R$.
Thus R, being a product of invertible matrices, is also invertible.
Hence R cannot have any zero rows. Since A is a square matrix,
this means that $R = I_n$, and now $PA = I_n$ implies that
$A^{-1} = P_l \cdots P_2P_1$. Thus the inverse of an invertible matrix can be found
by applying EROs to reduce A to $I_n$. One way to do this is to
apply these EROs to the matrix

\[ [\,A \;\; I_n\,]. \]

The result is

\[ [\,I_n \;\; A^{-1}\,]. \]

(If the rre-form of A does not equal $I_n$, then A is not invertible.)
From the above discussion, we now obtain the following corollary.

Corollary 6.1.10 A square matrix of order n is invertible if and
only if it is a product of elementary matrices. 
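
The procedure of row-reducing the matrix $[A \;\; I_n]$ can be sketched as follows. This is only an illustration under our own assumptions: the function name `inverse` is hypothetical, entries are treated as exact rationals, and `None` is returned when the rre-form of A is not $I_n$.

```python
from fractions import Fraction

def inverse(a):
    """Invert a square matrix by row-reducing [A | I_n]; return None if A is not invertible."""
    n = len(a)
    # Build the n by 2n matrix [A | I_n] with exact rational entries.
    m = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(a)]
    for col in range(n):
        src = next((i for i in range(col, n) if m[i][col] != 0), None)
        if src is None:
            return None  # no pivot in this column, so the rre-form of A is not I_n
        m[col], m[src] = m[src], m[col]                            # type I ERO
        p = m[col][col]
        m[col] = [x / p for x in m[col]]                           # type II ERO
        for i in range(n):
            if i != col and m[i][col] != 0:
                c = m[i][col]
                m[i] = [x - c * y for x, y in zip(m[i], m[col])]   # type III ERO
    # The right-hand half of the reduced matrix is the inverse.
    return [row[n:] for row in m]

# Illustrative matrices, not taken from the text.
A = [[2, 1], [5, 3]]
A_inv = inverse(A)
singular = inverse([[1, 2], [2, 4]])
```

Each ERO applied here corresponds to left multiplication by an elementary matrix, so the right half accumulates the product $P_l \cdots P_2P_1$.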


The row space of an m by n matrix A, consisting as it does of
all the linear combinations of the rows of A, is the set of all vectors
of the form $u^TA$ as u ranges over all m by 1 vectors. Let P be an
invertible matrix of order m. Then

\[ u^T(PA) = (u^TP)A = v^TA, \]

where $v^T = u^TP$; conversely,

\[ v^TA = v^T(P^{-1}P)A = (v^TP^{-1})(PA) = u^T(PA), \]

where $u^T = v^TP^{-1}$. These two equations imply that the row space
of A is the same as the row space of PA for any invertible matrix 
P, and hence the row rank of A equals the row rank of PA. In 
particular, the row rank of A equals the row rank of its rre-form, 
and this is easily seen to be the number of pivots (number of 
nonzero rows). In general, the column space does change, but the 
linear dependence or linear independence of the columns does not. 
Another way of saying the same thing is that the null space of A 
equals the null space of PA for every invertible matrix: 

If $Au = 0$, then $(PA)u = 0$; conversely, if $(PA)u = 0$, then
multiplying by $P^{-1}$ we see that $Au = 0$.
From this we conclude that the column rank of A equals the col- 
umn rank of PA. In particular, the column rank of A equals the 
column rank of its rre-form, and this is also easily seen to be the 
number of pivots. We conclude that the row rank and column rank 
of a matrix are always equal and the common value is the number 
of pivots in its rre-form. This common value is called the rank of
A and is denoted as r(A). The dimension of the null space is the
number of free variables, and this equals $n - r(A)$.

Suppose the m by n matrix A has an invertible submatrix of 
order k. Then the k rows (and the k columns) of A containing 
this submatrix are linearly independent, and so the rank of A is 
at least equal to k. Conversely, suppose the rank of A equals k. 
Then A has k linearly independent rows and these form a k by n
submatrix A’ of A with rank equal to k. Since the row rank of
A’ equals its column rank, A’ has k linearly independent columns
forming a square submatrix B of order k of A’ and hence of A.
The matrix B has linearly independent rows and so its rre-form is
$I_k$. Hence B is invertible and $\det B \neq 0$. It follows that the rank
of A equals the largest order of a submatrix of A that is invertible.
In the next theorem we collect some of these observations. 


Theorem 6.1.11 Let A be an m by n matrix.


(i) The row and column ranks of A are equal. The common
value is the rank of A, denoted r(A).


(ii) The rank of A plus the nullity of A equals n:

\[ r(A) + n(A) = n. \]


(iii) The rank of A equals the largest integer k such that A has a
submatrix of order k whose determinant is not zero, equiva-
lently, the largest order of an invertible submatrix of A.


6.2 Cramer’s Formula 


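Consider a linear system of n equations in n unknowns

\[
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1\\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2\\
&\;\;\vdots\\
a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n &= b_n,
\end{aligned}
\]

or in matrix form

\[ Ax = b, \tag{6.3} \]

where

\[
A = \begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n}\\
a_{21} & a_{22} & \cdots & a_{2n}\\
\vdots & \vdots & \ddots & \vdots\\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{bmatrix}
\]

is the matrix of coefficients and

\[
x = \begin{bmatrix} x_1\\ x_2\\ \vdots\\ x_n \end{bmatrix}
\quad\text{and}\quad
b = \begin{bmatrix} b_1\\ b_2\\ \vdots\\ b_n \end{bmatrix}
\]

are the column vectors of unknowns and constant terms, respec-
tively.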

We now assume that the coefficient matrix A is invertible.
From Chapter 5 we know that the entry of $A^{-1}$ in position $(i,j)$ is

\[ (-1)^{j+i}\,\frac{\det A_{ji}}{\det A} \qquad (i, j = 1, 2, \ldots, n), \tag{6.4} \]

where $A_{ji}$ is the submatrix of A of order $n - 1$ obtained by deleting
row j and column i. Multiplying both sides of (6.3) by $A^{-1}$, we
get

\[ A^{-1}(Ax) = A^{-1}b. \]

Since $A^{-1}A = I_n$ and $I_nx = x$, we get that

\[ x = A^{-1}b \]

is the unique solution of (6.3). To find the solution we multiply
b on its left by $A^{-1}$. Using the formula for the entries of $A^{-1}$ as
given in (6.4), we obtain

\[ x_i = \frac{1}{\det A}\sum_{j=1}^{n} (-1)^{j+i}\, b_j \det A_{ji} \qquad (i = 1, 2, \ldots, n). \tag{6.5} \]


Let $A^{(i)}$ be the matrix of order n obtained from A by replacing
its ith column with the column vector b. As discussed in Chapter
4, the cofactors of the entries in column i of $A^{(i)}$ do not depend on
what is actually contained in column i. This implies that these co-
factors are the same as the corresponding cofactors of the elements
in column i of A. It follows that by developing the determinant of
$A^{(i)}$ along column i, we get that

\[ \det A^{(i)} = \sum_{j=1}^{n} (-1)^{j+i}\, b_j \det A_{ji} \qquad (i = 1, 2, \ldots, n). \tag{6.6} \]

Comparing (6.6) with (6.5), we see that

\[ x_i = \frac{\det A^{(i)}}{\det A} \qquad (i = 1, 2, \ldots, n). \tag{6.7} \]

This is Cramer’s formula, which we summarize in the next the-
orem. It expresses the solution of $Ax = b$ as the quotient of two
determinants.


Theorem 6.2.1 Let $Ax = b$ be a system of linear equations in
n unknowns where the matrix A of coefficients is invertible. Let
$A^{(i)}$ be the matrix of order n obtained from A by replacing its ith
column with the column vector b $(1 \le i \le n)$. Then $Ax = b$ has a
unique solution $x = (x_1, x_2, \ldots, x_n)^T$ given by

\[ x_i = \frac{\det A^{(i)}}{\det A} \qquad (i = 1, 2, \ldots, n). \]


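Example 6.2.2 We solve the system of linear equations

\[
\begin{aligned}
x_1 + 3x_2 - 4x_3 &= 2\\
0x_1 - x_2 + 3x_3 &= 0\\
3x_1 + x_2 + x_3 &= -1.
\end{aligned}
\]

The matrix of coefficients is

\[ A = \begin{bmatrix} 1 & 3 & -4\\ 0 & -1 & 3\\ 3 & 1 & 1 \end{bmatrix} \]

with determinant equal to 11. Thus A is invertible and the unique
solution $x = (x_1, x_2, x_3)^T$ is given by

\[
\begin{aligned}
x_1 &= (1/11)\det\begin{bmatrix} 2 & 3 & -4\\ 0 & -1 & 3\\ -1 & 1 & 1 \end{bmatrix} = -\frac{13}{11},\\
x_2 &= (1/11)\det\begin{bmatrix} 1 & 2 & -4\\ 0 & 0 & 3\\ 3 & -1 & 1 \end{bmatrix} = \frac{21}{11},\\
x_3 &= (1/11)\det\begin{bmatrix} 1 & 3 & 2\\ 0 & -1 & 0\\ 3 & 1 & -1 \end{bmatrix} = \frac{7}{11}.
\end{aligned}
\]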
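
Cramer's formula translates directly into code. The sketch below is illustrative only: the helper names `det` and `cramer` are ours, the determinant is computed by Laplace expansion (reasonable only for small orders), and the data are taken to be the system of Example 6.2.2 read as $x_1 + 3x_2 - 4x_3 = 2$, $-x_2 + 3x_3 = 0$, $3x_1 + x_2 + x_3 = -1$.

```python
from fractions import Fraction

def det(m):
    """Determinant by Laplace expansion along the first row (fine for small orders)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def cramer(a, b):
    """Solve Ax = b by Cramer's formula x_i = det A^(i) / det A."""
    d = det(a)
    if d == 0:
        raise ValueError("coefficient matrix is not invertible")
    x = []
    for i in range(len(a)):
        # A^(i): replace column i of A by the column vector b.
        a_i = [row[:i] + [b[r]] + row[i + 1:] for r, row in enumerate(a)]
        x.append(Fraction(det(a_i), d))
    return x

A = [[1, 3, -4], [0, -1, 3], [3, 1, 1]]
b = [2, 0, -1]
x = cramer(A, b)   # x_1, x_2, x_3 as exact fractions
```

With these data the sketch returns the fractions $-13/11$, $21/11$, $7/11$, which can be checked by substitution into the three equations.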


6.3 Solving Linear Systems 
by Digraphs 


Electrical engineers have developed a series of methods for solving 
systems of linear algebraic equations that appear in the theory 
of electrical circuits, control theory, and other areas. We start in 
this section by explaining the flow graph method of Coates, known 
since the 1950s. In the next section we will describe the signal
flow graph technique of Mason. 

As in the previous section, we again consider a system 


\[ Ax = b, \quad\text{equivalently}\quad -b + Ax = 0 \tag{6.8} \]


of n linear equations in n unknowns written as one matrix equa-
tion. The matrix $A = [a_{ij}]$ is a square matrix of order n.


Definition 6.3.1 The Coates digraph (also called the flow digraph
or simply flow graph) of the linear system (6.8) is the Coates di-
graph $D^*(-b, A)$ of the matrix $[-b \;\; A]$ with n + 1 vertices labeled
0, 1, 2, ..., n, whose directed edges are those given by the following
rules:

(i) For each i, j with $1 \le i, j \le n$ and $a_{ij} \neq 0$, there is an edge
from vertex j to vertex i of weight $a_{ij}$.


(ii) For each i with $1 \le i \le n$ and $b_i \neq 0$, there is an edge from
vertex 0 to vertex i of weight $-b_i$.


It is to be observed that the subdigraph of D*(—b, A) induced on 
the set of vertices {1,2,...,n} is just the Coates digraph D*(A) 
of the coefficient matrix A. It is also to be observed that there are 
no directed edges entering vertex 0. 


Writing, as we have, the system $Ax = b$ in the equivalent way
$-b + Ax = 0$ and forming the n by n + 1 augmented matrix as

\[
[-b \;\; A] = \begin{bmatrix}
-b_1 & a_{11} & a_{12} & \cdots & a_{1n}\\
-b_2 & a_{21} & a_{22} & \cdots & a_{2n}\\
\vdots & \vdots & \vdots & \ddots & \vdots\\
-b_n & a_{n1} & a_{n2} & \cdots & a_{nn}
\end{bmatrix}, \tag{6.9}
\]

we see that we could regard the Coates digraph of $Ax = b$ as
being constructed from the matrix $[-b \;\; A]$, where the vertex 0
corresponds to the initial column. To make this even more precise,
we could imagine that an initial row of all zeros has been attached
to (6.9) to obtain a square matrix of order n + 1; the Coates
digraph of the linear system $Ax = b$ then becomes the Coates
digraph of the resulting matrix of order n + 1, with vertices labeled
0, 1, 2, ..., n.


Example 6.3.2 The Coates digraph D*(—b, A) of the system of 
linear equations 


\[
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + 0x_4 &= b_1\\
0x_1 + a_{22}x_2 + 0x_3 + a_{24}x_4 &= 0\\
a_{31}x_1 + 0x_2 + a_{33}x_3 + 0x_4 &= b_3\\
0x_1 + a_{42}x_2 + a_{43}x_3 + a_{44}x_4 &= 0
\end{aligned} \tag{6.10}
\]

is displayed in Figure 6.1, where as usual edges of weight 0 are not
shown.


Figure 6.1


The part of the Coates digraph obtained by deleting from 
D*(—b, A) the vertex 0 and all the edges that leave vertex 0 is 
the Coates digraph D*(A) corresponding to the matrix A. 


We now relate the 1-connections of $D^*(-b, A)$ to those of
$D^*(A)$. Let $D^* = D^*(A)$. Let $F = D^*[i \to j]$ be a 1-connection
of $D^*$ from i to j. If $b_i \neq 0$, then $D^*(-b, A)$ contains the edge
from vertex 0 to vertex i of weight $-b_i$. Appending this edge
to $D^*[i \to j]$, we obtain a 1-connection $F'$ of $D^*(-b, A)$ from 0
to j. Conversely, a 1-connection $F'$ of $D^*(-b, A)$ from vertex 0
to vertex j gives, upon deletion of vertex 0 and the unique edge
leaving it, a 1-connection F of $D^*$ from some vertex i to vertex
j. Because 1-connections of $D^*(-b, A)$ can only go from vertex 0
to some vertex $j \ge 1$, we have a one-to-one correspondence be-
tween the 1-connections of $D^*(A)$ and those of $D^*(-b, A)$. The
weights of these two 1-connections F and $F'$ under this one-to-one
correspondence are related by the formula


\[ w(F') = -b_i\, w(F). \tag{6.11} \]


We now show how to express the solution of (6.3) in terms of 
its Coates digraph. The key to this is Cramer’s formula and the 
determinant formula given in its definition. 

Assume as before that A is invertible, that is, $\det A \neq 0$. Then
$Ax = b$ has a unique solution and, by Cramer’s formula, this
solution is

\[ x_i = \frac{\det A^{(i)}}{\det A} = \frac{1}{\det A}\sum_{j=1}^{n} (-1)^{j+i}\, b_j \det A_{ji} \qquad (i = 1, 2, \ldots, n). \tag{6.12} \]

We now apply our definition of a determinant from Section 4.1
and our formula for the cofactors in terms of the Coates digraph
as given in (5.9) to each of the determinants given in (6.12). We
then obtain

\[ x_i = \frac{\sum_{j=1}^{n} (-b_j) \sum (-1)^{c(D^*(A)[j \to i])}\, w(D^*(A)[j \to i])}{\sum_{L \in \mathcal{L}(A)} (-1)^{c(L)}\, w(L)} \tag{6.13} \]

for i = 1, 2, ..., n, where $c(\cdot)$ denotes the number of cycles, the
inner summation in the numerator extends
over all of the 1-connections $D^*(A)[j \to i]$ of $D^*(A)$ from j to
i, and the summation in the denominator extends over all linear
subdigraphs L of $D^*(A)$. From our discussion comparing the 1-
connections of $D^*(A)$ to those of $D^*(-b, A)$, we can rewrite (6.13)
as

\[ x_i = \frac{\sum (-1)^{c(D^*(-b, A)[0 \to i])}\, w(D^*(-b, A)[0 \to i])}{\sum_{L \in \mathcal{L}(A)} (-1)^{c(L)}\, w(L)} \tag{6.14} \]

for i = 1, 2, ..., n.


Formula (6.14) is the Coates formula for solving a system of 
n linear equations in n unknowns with an invertible coefficient 
matrix. 
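
For small systems the Coates formula can be checked by brute force. The sketch below is our own illustration, not the hand method of the text: it enumerates linear subdigraphs as permutations, and it turns each 1-connection from vertex 0 to vertex i into a permutation of {0, 1, ..., n} by adding an auxiliary edge from i to 0 (an assumption of this sketch, which is why one cycle is subtracted in the sign). The names `cycle_count` and `coates_solve` are hypothetical, and A is assumed invertible.

```python
from fractions import Fraction
from itertools import permutations

def cycle_count(perm):
    """Number of cycles of a permutation given as a tuple with perm[v] = successor of v."""
    seen, count = set(), 0
    for start in range(len(perm)):
        if start not in seen:
            count += 1
            v = start
            while v not in seen:
                seen.add(v)
                v = perm[v]
    return count

def coates_solve(a, b):
    """Solve Ax = b via signed sums over linear subdigraphs and 1-connections."""
    n = len(a)
    # Denominator: every linear subdigraph of D*(A) is a permutation s of the
    # vertices; the edge v -> s(v) has weight a[s(v)][v].
    denom = 0
    for s in permutations(range(n)):
        w = 1
        for v in range(n):
            w *= a[s[v]][v]
        denom += (-1) ** cycle_count(s) * w

    # Weight of the edge u -> v in D*(-b, A); vertex 0 is the extra vertex and
    # vertex k >= 1 stands for the unknown x_k.
    def weight(u, v):
        if v == 0:
            return 0                      # no edge enters vertex 0
        return -b[v - 1] if u == 0 else a[v - 1][u - 1]

    x = []
    for i in range(1, n + 1):
        numer = 0
        # Adding the auxiliary edge i -> 0 closes the path of a 1-connection
        # into one extra cycle, which is not counted in the sign.
        for t in permutations(range(n + 1)):
            if t[i] != 0:
                continue
            w = 1
            for u in range(n + 1):
                if u != i:
                    w *= weight(u, t[u])
            numer += (-1) ** (cycle_count(t) - 1) * w
        x.append(Fraction(numer, denom))
    return x

# Same data as the system solved by Cramer's formula in Section 6.2.
x = coates_solve([[1, 3, -4], [0, -1, 3], [3, 1, 1]], [2, 0, -1])
```

Terms containing an absent edge have weight 0 and drop out, so only genuine linear subdigraphs and 1-connections contribute. The enumeration grows factorially with n, which is consistent with the remark that the method is meant for small, sparse systems.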


Example 6.3.3 We continue with Example 6.3.2. Figure 6.2 dis- 
plays all linear subgraphs of the digraph corresponding to the ma- 
trix of the system of equations (6.10), while Figure 6.3 displays 
all 1-connections from the vertex 0 to the vertex 3 of the Coates 
digraph in Figure 6.1. 


Using the formula, we get that the value of $x_3$ is

\[
x_3 = \frac{b_3a_{11}a_{22}a_{44} - b_3a_{11}a_{42}a_{24} - b_1a_{31}a_{22}a_{44} + b_1a_{31}a_{24}a_{42}}
{a_{11}a_{22}a_{33}a_{44} - a_{12}a_{24}a_{43}a_{31} + a_{13}a_{31}a_{42}a_{24} - a_{11}a_{33}a_{42}a_{24} - a_{22}a_{44}a_{13}a_{31}}.
\]


Figure 6.2


The potential advantage of using the Coates formula for solv- 
ing a system of linear equations results from the fact that the 
digraph from which we find the solution can be drawn if we are 
acquainted with the structure of the system described by linear 
equations, without the need to write the equations; this happens, 
e.g., in electric circuit theory and control theory. In practice we 
usually do not list the linear subdigraphs and 1-connections, but 
determine directly from the Coates digraph the unknown value 
we are seeking. This requires some experience, because without 
close attention, some 1-connections or linear subgraphs may be 
overlooked. Although no efficient general rule for systematically 
finding the linear subdigraphs and 1-connections is known, the
following might help. The linear subdigraphs can be classified ac-
cording to the number of loops contained in them. In order to
find 1-connections $D^*(-b, A)[0 \to i]$ it is necessary to determine
all paths from the vertex 0 to the vertex i, and these paths can
be classified according to the vertices that come immediately after
the vertex 0.


Figure 6.3


The method we have described is intended for calculation by 
hand. The use of a computer with this method is not recom- 
mended, because the power of a computer can be better exploited 
with other methods that are not suitable for hand calculations. 
The practical usefulness of this method is limited to systems of 
equations with not more than about ten unknowns, with the con- 
dition that the corresponding digraph has a comparatively small 
number of edges. Otherwise, it is not practical to find all linear 
subgraphs and necessary 1-connections in a digraph. 

The use of digraphs in solving a system of linear algebraic 
equations is particularly convenient if the coefficients of the system 
are not numerical values, as in Example 6.3.3. 




6.4 Signal Flow Digraphs of Linear 
Systems 


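In this section we discuss a method of Mason for solving certain
systems of n linear equations in n + 1 unknowns $x_0, x_1, x_2, \ldots, x_n$,
where $x_0$ is distinguished as a parameter, with the resulting solu-
tion expressed in terms of $x_0$.

The system of linear equations is assumed to be of the form

\[
\begin{aligned}
x_1 &= a_{10}x_0 + a_{11}x_1 + \cdots + a_{1n}x_n\\
x_2 &= a_{20}x_0 + a_{21}x_1 + \cdots + a_{2n}x_n\\
&\;\;\vdots\\
x_n &= a_{n0}x_0 + a_{n1}x_1 + \cdots + a_{nn}x_n.
\end{aligned} \tag{6.15}
\]

The coefficient matrix

\[
B = \begin{bmatrix}
a_{10} & a_{11} & a_{12} & \cdots & a_{1n}\\
a_{20} & a_{21} & a_{22} & \cdots & a_{2n}\\
\vdots & \vdots & \vdots & \ddots & \vdots\\
a_{n0} & a_{n1} & a_{n2} & \cdots & a_{nn}
\end{bmatrix}
\]

is an n by n + 1 matrix. We will also want to consider the initial
column

\[
A_0 = \begin{bmatrix} a_{10}\\ \vdots\\ a_{n0} \end{bmatrix}
\]

of B, and the square matrix

\[
A = \begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n}\\
\vdots & \vdots & \ddots & \vdots\\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{bmatrix}
\]

of order n obtained from B by deleting its initial column.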
As with the augmented matrix of a linear system as discussed
in Section 6.3, we can imagine that the matrix B has been enlarged
by adding an initial row of all zeros, thereby obtaining a square
matrix of order n + 1. Then we consider the Coates digraph $D^*(B)$
with n + 1 vertices denoted by $x_0, x_1, x_2, \ldots, x_n$ with an edge from
vertex $x_j$ to vertex $x_i$ if and only if $b_{ij} \neq 0$, where $b_{ij}$ denotes the
entry in row i and column j of the enlarged matrix, with rows and
columns numbered from 0. In $D^*(B)$, there are no
edges that enter vertex $x_0$ (as vertex $x_0$ corresponds to the initial
row of zeros we imagined). The digraph $D^*(B)$ obtained in this
way is called the signal flow digraph, or Mason’s digraph, of the
system (6.15).


Figure 6.4


Example 6.4.1 The signal flow digraph corresponding to the sys- 
tem of equations 


\[
\begin{aligned}
x_1 &= a_{10}x_0 + a_{11}x_1\\
x_2 &= a_{20}x_0 + a_{21}x_1 + a_{23}x_3\\
x_3 &= a_{31}x_1 + a_{32}x_2,
\end{aligned}
\]


is given in Figure 6.4. 


As usual, we define the weight of paths and cycles to be the 
product of the weight of their edges. 


Definition 6.4.2 Let the directed cycles of the weighted digraph
$D^*(A)$ be enumerated by $C_1, C_2, \ldots, C_j, \ldots$ with weights, respec-
tively, $t_1, t_2, \ldots, t_j, \ldots$. Let the paths from vertex $x_0$ to vertex $x_i$
be enumerated by $P_1^{(i)}, P_2^{(i)}, \ldots, P_j^{(i)}, \ldots$ $(i = 1, 2, \ldots, n)$. We define
$w_j^{(i)}$ to be the weight of $P_j^{(i)}$ $(i = 1, 2, \ldots, n;\ j \ge 1)$.

Mason’s determinant $\Delta_M = \Delta_M(D^*(B))$ of the weighted di-
graph $D^*(B)$ is defined by

\[ \Delta_M = 1 - {\sum_{i}}' t_i + {\sum_{i,j}}' t_it_j - {\sum_{i,j,k}}' t_it_jt_k + \cdots, \tag{6.16} \]

where $\sum' t_i$ is the sum of the weights of all cycles in the digraph,
$\sum' t_it_j$ is the sum of the products of the weights of all pairs of
cycles having no common vertex, $\sum' t_it_jt_k$ is the sum of the
products of the weights of all triples of nontouching cycles, that
is, cycles no two of which have a common vertex, and so forth.


Definition 6.4.3 Let G be a weighted digraph with n vertices.
Then the Coates determinant $\Delta_C = \Delta_C(G)$ of G is defined by

\[ \Delta_C = \sum_{L} (-1)^{c(L)}\, w(L), \tag{6.17} \]

where the summation extends over all linear subdigraphs L of the
digraph G and $c(L)$ denotes the number of cycles of L. Note that
in the case that G is the Coates digraph
$D^*(A)$ of a matrix A of order n, then $\Delta_C(D^*(A)) = (-1)^n \det A$.


The system of equations (6.15) can be written in the form

\[ x = x_0A_0 + Ax, \quad\text{equivalently}\quad (A - I_n)x = -x_0A_0. \tag{6.18} \]

Regarding $x_0$ as fixed, the augmented matrix of (6.18) is the n by
n + 1 matrix

\[ [\,x_0A_0 \;\; A - I_n\,]. \]

Thus the Coates digraph $D^*(x_0A_0, A)$ of the system (6.18) is just
the Mason digraph with the weights changed as follows: the weight
of each edge leaving the vertex labeled $x_0$ is obtained by multiplying
by $x_0$, and the weight of the loops at vertices labeled $x_1, x_2, \ldots, x_n$
is obtained by subtracting 1. Note that if there wasn’t a loop at
one of the vertices $x_i$ $(1 \le i \le n)$ (that is, $a_{ii} = 0$), then a loop
of weight $-1$ is inserted; if there was a loop of weight 1 (that is,
$a_{ii} = 1$), then the loop disappears.

Our goal is to show that the solution of the system of linear
equations (6.15) is given by the following formula, known as Ma-
son’s formula:

\[ x_i = \frac{\sum_j w_j^{(i)} \Delta_j^{(i)}}{\Delta_M}\, x_0, \tag{6.19} \]

where the summation extends over all the paths from $x_0$ to $x_i$ in
$D^*(B)$ as enumerated in Definition 6.4.2. In this formula,

(i) $\Delta_j^{(i)}$ denotes Mason’s determinant of the subdigraph of
$D^*(B)$ obtained by deleting all vertices of the path $P_j^{(i)}$.


(ii) $\Delta_M$ is Mason’s determinant of $D^*(B)$; by (6.21) it equals
$(-1)^n$ times the (ordinary) determinant of the matrix $A - I_n$ of
order n, where A, as previously explained, is obtained from
B by deleting its initial column.


Figure 6.5

Example 6.4.4 Consider the system of linear equations

\[
\begin{aligned}
x_1 &= bx_0 + dx_1 + fx_3\\
x_2 &= ax_0 + cx_1\\
x_3 &= ex_1 + gx_3,
\end{aligned}
\]

whose corresponding signal flow digraph is displayed in Figure 6.5.
Using formula (6.19) we get

\[ x_2 = \frac{a(1 - (d + ef + g) + gd) + bc(1 - g)}{1 - (d + ef + g) + gd}\, x_0. \]
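
The value obtained from Mason's formula can be checked numerically. In the sketch below the numeric edge weights and the value of $x_0$ are hypothetical choices of ours, and the system is assumed to read $x_1 = bx_0 + dx_1 + fx_3$, $x_2 = ax_0 + cx_1$, $x_3 = ex_1 + gx_3$; the expressions for $x_1$ and $x_3$ are obtained from (6.19) in the same way as the one for $x_2$.

```python
from fractions import Fraction as F

# Hypothetical numeric edge weights for the signal flow digraph of the example.
a, b, c, d, e, f, g = F(2), F(3), F(5), F(1, 2), F(1, 3), F(1, 4), F(1, 5)
x0 = F(7)

# Cycles: loop d at x1, loop g at x3, and the 2-cycle x1 -> x3 -> x1 of weight e*f.
# The only pair of nontouching cycles is the pair of loops.
delta_m = 1 - (d + e * f + g) + d * g

# Each term: weight of a path from x0, times Mason's determinant of the
# subdigraph left after deleting the vertices of that path.
x1 = b * (1 - g) / delta_m * x0                       # path x0 -> x1
x2 = (a * delta_m + b * c * (1 - g)) / delta_m * x0   # paths x0 -> x2 and x0 -> x1 -> x2
x3 = b * e / delta_m * x0                             # path x0 -> x1 -> x3

# The values must satisfy the original equations of the example.
assert x1 == b * x0 + d * x1 + f * x3
assert x2 == a * x0 + c * x1
assert x3 == e * x1 + g * x3
```

Exact fractions are used so that the three equations can be tested for equality rather than approximate agreement.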


The introduction of the distinguished variable $x_0$ is not math-
ematically necessary but it is useful in applications of signal flow
graphs to control theory (see Section 10.1).

Not surprisingly, Mason’s formula is derived with the help of 
the Coates formula. 

Using Mason’s formula as a model, we can rewrite the Coates
formula (6.14) for the solution of a linear system of n equations in
n unknowns as

\[ x_i = \frac{\sum_k q_k^{(i)}\, \Delta_C(D^*(-b, A)_k)}{\Delta_C(D^*(A))}, \tag{6.20} \]

where $q_k^{(i)}$ denotes the weight of the kth path $P_k^{(i)}$ from the vertex
0 to the vertex i and $D^*(-b, A)_k$ is the subdigraph of the Coates
digraph obtained by deleting the vertices of this kth path.

Consider the digraph $D^*(A - I_n)$ that arises from the digraph
$D^*(A)$ by subtracting 1 from the weight of every loop of $D^*(A)$;
as before, if there is no loop at a vertex, then the vertex gets a
loop with the weight $-1$. We first verify the formula

\[ \Delta_C(D^*(A - I_n)) = \Delta_M(D^*(A)). \tag{6.21} \]
To see this, first note that

\[ \Delta_C(D^*(A - I_n)) = (-1)^n \det(A - I_n). \]

Setting $\lambda = 1$ in the formula given in Corollary 4.3.2, we get that

\[ (-1)^n \det(A - I_n) = (-1)^n \Bigl[ (-1)^n + \sum_{p=1}^{n} (-1)^{n-p} \sum_{A_p} \det A_p \Bigr], \tag{6.22} \]

where the second sum is taken over all principal submatrices $A_p$
of A of order p. Now, applying the definition of the determinant
to the quantities $\det A_p$, we see that (6.22) gives $\Delta_M(D^*(A))$ as
defined in (6.16).
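
As a sanity check of (6.21), which is our own illustration and not part of the original argument, take n = 2 and assume all four entries of A are nonzero. The cycles of $D^*(A)$ are then the two loops, of weights $a_{11}$ and $a_{22}$, and one cycle of length 2, of weight $a_{12}a_{21}$; the only pair of nontouching cycles is the pair of loops.

```latex
\[
\begin{aligned}
\Delta_M(D^*(A)) &= 1 - (a_{11} + a_{22} + a_{12}a_{21}) + a_{11}a_{22},\\
\Delta_C(D^*(A - I_2)) &= (-1)^2 \det
  \begin{bmatrix} a_{11} - 1 & a_{12}\\ a_{21} & a_{22} - 1 \end{bmatrix}
  = (a_{11} - 1)(a_{22} - 1) - a_{12}a_{21}\\
&= 1 - a_{11} - a_{22} - a_{12}a_{21} + a_{11}a_{22}.
\end{aligned}
\]
```

The two expressions agree term by term, as (6.21) asserts.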

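Now consider the system of equations (6.15) written in the
form

\[ (A - I_n)x = -x_0A_0 \]

as given in (6.18). Solving this system using (6.20), and using
(6.21), we get

\[ x_j = \frac{\sum_k q_k^{(j)}\, \Delta_C(D^*(x_0A_0, A)_k)}{\Delta_C(D^*(A - I_n))} = \frac{\sum_k w_k^{(j)} \Delta_k^{(j)}}{\Delta_M}\, x_0. \]

Here $q_k^{(j)}$ denotes the weight of the kth path from the vertex 0
to the vertex j and $D^*(x_0A_0, A)_k$ is the subdigraph of the Coates
digraph obtained by deleting the vertices of the kth path. Each
such path contains exactly one edge leaving vertex 0 and no loops,
so $q_k^{(j)} = x_0 w_k^{(j)}$, and (6.21) applied to the subdigraph gives
$\Delta_C(D^*(x_0A_0, A)_k) = \Delta_k^{(j)}$.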

It is remarkable that the graphs introduced for treating sys- 
tems of linear algebraic equations (and, especially, Mason’s signal 
flow graph) give a better insight into the physical system under 
description than the corresponding system of equations does. His- 
torically, these graphs were introduced and used intuitively, the 
theoretical background of them having been given later. See Sec- 
tion 10.1 for examples of using these techniques in electrical circuit 
and control theory.


6.5 Sparse Matrices


In this section we shall describe some specific features for treating 
systems of equations with a sparse matrix in which the entries are 
given numerically. Often problems in electrical engineering, and 
in engineering and science in general, lead to systems of linear 
equations whose matrix is sparse with entries given numerically. 
To a great extent, special methods of treating such matrices use 
graph-theoretical means [4], [11], [76], [29]. 

There is no strict quantitative criterion that determines when 
a matrix should be considered as a sparse matrix. Special proce- 
dures for treating sparse matrices include the execution of some 
additional operations, and therefore they are effective only if a 
matrix contains a sufficient number of “well-placed” entries equal 
to zero. Sometimes it is obvious that special techniques are not 
efficient. For example, if a square matrix of order 100 contains 10 
zeros, it is clearly best to ignore the zeros and to solve the system 
by standard techniques. However, if a matrix of order 1000 con- 
tains only 4000 entries different from zero, it may be advantageous 
to specify the matrix by the value and the position of its nonzero 
entries and deal with the matrix by special methods. Roughly 
speaking, we should consider a matrix to be sparse if, indepen-
dent of its order, each of its rows and columns contains only “a
few” nonzero elements. 

As indicated, a sparse matrix is stored in a computer by storing
only its nonzero entries, each with its row and column index. The
König digraph G(A) and the digraph D(A) play an important role 
in sparse matrix techniques applied to A. 

When dealing with systems of equations with a sparse matrix 
one would first try to split the system into subsystems, then solve 
each of the subsystems, and finally get the solution of the whole 
system from the solutions of the subsystems. 

As already indicated in Section 6.1, when describing the reduc- 
tion of a matrix to the reduced row-echelon form, permutation of 
equations and the permutation of unknowns in the equations play 
an important role. Permutations of rows and columns of the ma-
trix A correspond, respectively, to permutations of equations and
unknowns. Contrary to the permutations of rows and columns,
which will be used in Chapter 8, it is allowed here to apply differ-
ent permutations to rows and columns. In general, the matrix A
is transformed by two permutation matrices P and Q, such that
the new matrix has the form A’ = PAQ. The digraph G(PAQ) is
obtained from the digraph G(A) by independent permutations of
the labels of its black vertices (action of the matrix P) and of the
labels of its white vertices (action of the matrix Q).

Consider a system $Ax = b$, where A is a square matrix. We
assume that A is nonsingular so that $Ax = b$ has a unique solution.
Since A is nonsingular, $\det A \neq 0$ and hence G(A) has at least
one 1-factor F. We may permute the columns of A so that each 
edge of the 1-factor joins a black and white vertex with the same 
label. As a result, we get the system A’x = b, where each of the 
entries on the diagonal of the matrix A’ is nonzero. As described in 
Section 8.1, the digraph D(A’) has a number m (possibly m = 1) 
of strong components. By properly relabeling rows and columns 
of A’ (applying the same permutation to the rows and columns of 
A’), the matrix A’ takes the following block-triangular form: 


A O > O 
pe ee oe Ik (6.23) 
E E 
where A11, A22, ..., Amm, are square blocks and all the entries 


above these blocks equal 0. The blocks Akk correspond to the 
strong components of A’. (This is the Frobenius normal form of 
A’—actually in transposed form—as described in Section 8.1.) 

Let the vectors x and b from the system A’x = b be repre-
sented in the form $x^T = [x_1\ x_2\ \ldots\ x_m]$, $b^T = [b_1\ b_2\ \ldots\ b_m]$, where
$x_1, x_2, \ldots, x_m$ and $b_1, b_2, \ldots, b_m$ are vectors of dimensions that cor-
respond to the block sizes in (6.23). Then, solving the system
A’x = b is equivalent to solving the following subsystems:

\[ A_{kk}x_k = b_k - \sum_{j=1}^{k-1} A_{kj}x_j, \qquad k = 1, 2, \ldots, m, \tag{6.24} \]

where we solve first the system (6.24) for k = 1, that is, $A_{11}x_1 = b_1$,
and then solve, in order, the systems for k = 2, ..., m. Of course,
if m = 1, we only have the original system Ax = b.
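
The block-by-block solution of (6.24) can be sketched as follows. This is an illustration under our own assumptions: the matrix is already in block lower-triangular form, the orders of the diagonal blocks are passed in explicitly as `sizes`, each diagonal block is invertible, and the helper names `solve` and `block_forward_solve` are hypothetical.

```python
from fractions import Fraction

def solve(a, b):
    """Solve a small square system by Gauss-Jordan elimination (A assumed invertible)."""
    n = len(a)
    m = [[Fraction(x) for x in row] + [Fraction(b[i])] for i, row in enumerate(a)]
    for col in range(n):
        src = next(i for i in range(col, n) if m[i][col] != 0)
        m[col], m[src] = m[src], m[col]
        p = m[col][col]
        m[col] = [x / p for x in m[col]]
        for i in range(n):
            if i != col and m[i][col] != 0:
                c = m[i][col]
                m[i] = [x - c * y for x, y in zip(m[i], m[col])]
    return [row[n] for row in m]

def block_forward_solve(a, b, sizes):
    """Solve A'x = b for block lower-triangular A' with diagonal block orders `sizes`."""
    x = []
    start = 0
    for size in sizes:
        rows = range(start, start + size)
        # Right-hand side b_k minus the contribution of the blocks already solved.
        rhs = [b[i] - sum(a[i][j] * x[j] for j in range(start)) for i in rows]
        a_kk = [a[i][start:start + size] for i in rows]
        x.extend(solve(a_kk, rhs))
        start += size
    return x

# Hypothetical example with diagonal blocks of orders 2 and 1.
A = [[2, 1, 0],
     [1, 1, 0],
     [3, 4, 5]]
b = [3, 2, 12]
x = block_forward_solve(A, b, [2, 1])
```

Only the diagonal blocks are ever passed to the elimination routine, which is the point of splitting the system: the work depends on the block orders rather than on the order of the whole matrix.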

The digraphs D(A) and D(A’) are not in general isomorphic. 
The digraph D(A’) depends on which 1-factor F of G(A) was 
chosen. But what is true is that the number m of blocks in (6.23) 
does not depend on which 1-factor F was chosen, and the blocks 
$A_{kk}$ are uniquely determined up to row and column permutations.
This follows from the following observations: Similar to the proof 
of Theorem 4.2.10, no linear subdigraph of D(A’) contains edges 
corresponding to entries of off-diagonal blocks of A’. Therefore, for 
any splitting of the system into subsystems (with nonzero diagonal 
entries) each 1-factor of the König digraph G(A’) is a (disjoint) 
union of 1-factors of the König digraphs of the diagonal blocks,
and therefore it does not matter which 1-factor we chose at the 
beginning. 

Keeping in mind the above considerations, it is useful to have 
an algorithm for finding strong components of a digraph. Such an 
algorithm is given in Section 3.7 of [7]. For graph algorithms in 
general, one may consult [31], [54], [39].
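
For orientation, here is a minimal sketch of one standard method for finding strong components, Tarjan's algorithm. It is not necessarily the algorithm of the cited reference, and the convention that an edge (i, j) records a nonzero off-diagonal entry $a_{ij}$ is our assumption for this illustration.

```python
def strong_components(n, edges):
    """Strong components of a digraph on vertices 0..n-1 (Tarjan's algorithm)."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
    index = [None] * n      # discovery order of each vertex
    low = [0] * n           # smallest discovery index reachable from the vertex
    on_stack = [False] * n
    stack, components = [], []
    counter = 0

    def visit(v):
        nonlocal counter
        index[v] = low[v] = counter
        counter += 1
        stack.append(v)
        on_stack[v] = True
        for w in adj[v]:
            if index[w] is None:
                visit(w)
                low[v] = min(low[v], low[w])
            elif on_stack[w]:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:
            # v is the root of a strong component: pop the component off the stack.
            comp = []
            while True:
                w = stack.pop()
                on_stack[w] = False
                comp.append(w)
                if w == v:
                    break
            components.append(sorted(comp))

    for v in range(n):
        if index[v] is None:
            visit(v)
    return components

# Hypothetical zero pattern: an edge (i, j) for each nonzero off-diagonal entry a_ij.
edges = [(0, 1), (1, 0), (2, 0), (2, 3), (3, 2)]
comps = strong_components(4, edges)
```

The components are produced with those having no edges to other components first. Under the edge convention assumed here, listing rows and columns in that order puts every nonzero entry on or below the diagonal blocks, which is the shape required for block forward substitution. The recursive form is adequate for small digraphs; very large ones would need an iterative version.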

We are now left with the problem of solving a system that 
cannot be further split into smaller subsystems as described above. 
Thus consider a system Ax = b, where A is a sparse matrix, and 
the system cannot be split into the subsystems. We again apply 
the procedure for finding a reduced row-echelon form from Section 
6.1 (Gaussian elimination), but its usage now has a number of 
special features. 

In order to avoid numerical difficulties when dividing by a num- 
ber close to zero, it is usually required in working with general 
matrices that the pivot element in each step is of maximal mod- 
ulus. In sparse matrices it is required that the pivot is greater 
in modulus than a minimal value that is recommended in the lit- 
erature for several types of problems. We shall assume that this 
condition is always fulfilled when making a choice of the pivoting 
element. This additional freedom in choosing the pivoting element 
enables the reduction in the number of nonzero entries by which 
the matrix is filled when applying EROs of type III.

Example 6.5.1 If we apply type III EROs on the system with


the matrix 
2 3 45 


—3 


=2 
2 


See eee 
N 


the lower right 4 x 4 submatrix will be completely filled (with 
nonzero numbers) already in the first step of the process. However, 
by permuting rows and columns we get the matrix 


Now the process does not lead to the appearance of new nonzero 
entries. 
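
The effect of ordering on fill can be illustrated on zero patterns alone. The sketch below is our own illustration with a hypothetical "arrowhead" pattern of order 5 (one full row, one full column, and a nonzero diagonal); it counts only new nonzero positions created below the pivots, assumes nonzero diagonal pivots, and ignores accidental numerical cancellation.

```python
def fill_in(pattern):
    """Count new nonzero positions created by elimination down the diagonal.

    `pattern` is a square 0/1 matrix marking nonzero entries; the diagonal is
    assumed nonzero and numerical cancellation is ignored.
    """
    n = len(pattern)
    p = [row[:] for row in pattern]
    created = 0
    for k in range(n):
        for i in range(k + 1, n):
            if p[i][k]:
                # Adding a multiple of row k to row i makes entry (i, j)
                # nonzero wherever row k is nonzero to the right of the pivot.
                for j in range(k + 1, n):
                    if p[k][j] and not p[i][j]:
                        p[i][j] = 1
                        created += 1
    return created

n = 5
# Arrowhead pattern with the full row and the full column in first position ...
first = [[1 if i == 0 or j == 0 or i == j else 0 for j in range(n)] for i in range(n)]
# ... and the same pattern with rows and columns reversed, so they come last.
last = [[first[n - 1 - i][n - 1 - j] for j in range(n)] for i in range(n)]

fill_first = fill_in(first)
fill_last = fill_in(last)
```

With the full row and column first, the first elimination step fills the whole lower right 4 by 4 block; with the reversed ordering no new nonzero position is created.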


Keeping in mind the way of storing a sparse matrix in the 
memory of a computer and the fact that arithmetical operations 
are performed only with nonzero entries, it is clear that the ap-
pearance of new nonzero elements leads to larger occupation of
the memory and a longer running time of the program. It can 
also cause an interruption of the program execution if the space in 
memory is exhausted. Therefore the minimization of the number 
of new nonzero entries—the so-called fill—which appear in the 
process, is one of the central questions in the work with sparse 
matrices. 

Consider the König digraph G(A) of the sparse matrix A. The
weight of the arc between the black vertex i and the white vertex j
is $a_{ij}$. Let $d_i^+$ be the outdegree of i and $d_j^-$ the indegree of j. If $a_{ij}$
is chosen for the pivot element, we must produce a zero at $d_j^- - 1$
places in the jth column, i.e., the ith row is added $d_j^- - 1$ times
to other rows (previously multiplied by an appropriate number).
With each such addition we add $d_i^+ - 1$ entries that are different
from zero. In total, we add $(d_j^- - 1)(d_i^+ - 1)$ nonzero entries at this
step. Since the matrix is sparse, most of these entries will be added
to zero entries, and thus as many
as $(d_i^+ - 1)(d_j^- - 1)$ new nonzero entries may appear in this step.
Hence, the pivot entry is often determined by the edge (i, j) of the
digraph G(A) for which $(d_i^+ - 1)(d_j^- - 1)$ is minimal. Of course,
in any step of the process we consider a vertex-deleted subgraph
of the original digraph G(A).
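
The pivot selection rule just described can be sketched as follows. The function name `min_fill_pivot` is hypothetical, the matrix is stored densely only for readability, and the lower bound on the modulus of the pivot mentioned earlier is omitted here, so this is an illustration of the counting rule only.

```python
def min_fill_pivot(a):
    """Return (cost, i, j) for a nonzero entry a[i][j] minimizing (r_i - 1)(c_j - 1).

    r_i is the number of nonzero entries in row i (the outdegree of black
    vertex i) and c_j the number in column j (the indegree of white vertex j).
    """
    rows = range(len(a))
    cols = range(len(a[0]))
    row_count = [sum(1 for j in cols if a[i][j] != 0) for i in rows]
    col_count = [sum(1 for i in rows if a[i][j] != 0) for j in cols]
    best = None
    for i in rows:
        for j in cols:
            if a[i][j] != 0:
                cost = (row_count[i] - 1) * (col_count[j] - 1)
                if best is None or cost < best[0]:
                    best = (cost, i, j)
    return best

# Hypothetical matrix with a full first row and column (indices are 0-based).
A = [[4, 1, 2, 3],
     [1, 5, 0, 0],
     [2, 0, 6, 0],
     [3, 0, 0, 7]]
cost, i, j = min_fill_pivot(A)
```

For this matrix the entry in the full first row and column has the largest bound, 9, while the diagonal entries in the sparse rows have bound 1, so one of them is selected. After each elimination step the counts would be recomputed on the remaining submatrix.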


6.6 Exercises 


1. Use EROs to find all solutions of the following system of
linear equations:


v1 2X9 + I = 3 
32] = 3X9 = 2x3 = 6 
8 


521 = 3X9 = 


2. Use EROs to find all solutions of the system of equations
Ax = b, where
[ ° 1 1 3 | 10] 
A=|13 —2 0| andb=|3|. 
Ex = 4 | | is | 


3. Use EROs to find the inverse of the matrix

\[ \begin{bmatrix} 2 & 3 & 1\\ 1 & 0 & 2\\ 1 & -1 & 2 \end{bmatrix}. \]


4. Find bases of the row space, column space, and null space
of the matrix


—1 —2 1 10 
1 2-2 —4 3 
Pe ve Oe. 224 
1 Z sear 2 
5. Use Cramer's rule to find the solution of the following system
of linear equations:


Uy 2X9 323 = 


321 0X2 8x3 
4%, + 5z + 10x3 = 1 


6. Use Cramer's rule to find the solution of Ax = b, where


1 2 3 s] ee 
412 8 0 
A= 3412 and b = 1 
23 4 1 0 


7. Solve the system of equations given in Example 6.3.2.


8. Draw the Coates digraph corresponding to the linear system


azz + bx3 =A 
cx, + dx +erı = B 
fay +9%3+hr4a = 0 
UL + VT3 = 0. 


Under the assumption that $acvh + bfeu - bcuh - afve \neq 0$,
find its solution.


9. Using (6.24), solve the system of equations Ax = b, where


Pwr bv 
oF eK he 
Pr Oe 
oe OO 


10. Solve the system of equations given in Example 10.1.1 in
Section 10.1.


Chapter 7 


Spectrum of a Matrix 


In this chapter we introduce the fundamental concepts of eigen- 
values and eigenvectors of a square matrix in the classical way. 
The eigenvalues of a matrix A of order n are roots of a polyno- 
mial, called the characteristic polynomial of A. The coefficients 
of this polynomial are sums of certain determinants of submatri- 
ces of A and thus can be described using digraphs as shown in 
Section 7.1. In Section 7.2, we give a combinatorial argument for 
the Cayley-Hamilton theorem, which asserts that a matrix satis- 
fies its characteristic polynomial. The study of eigenvalues leads 
to the notion of similarity of matrices and this, in turn, leads to 
the Jordan Canonical Form of a matrix in Section 7.3. We give 
a highly combinatorial argument for the existence of the Jordan 
Canonical Form of a matrix. The chapter is concluded with Sec- 
tion 7.4 which describes how eigenvalues of circulants, introduced 
in Chapter 3, can be calculated using associated digraphs. 


7.1 Eigenvectors and Eigenvalues


We begin with an important definition.

Definition 7.1.1 Let

$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix}$$

be a matrix of order $n$. Let $\lambda$ be a real or complex number. Then
$\lambda$ is an eigenvalue¹ of $A$ provided there is a nonzero column vector

$$u = \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix}$$

in $\mathbb{R}^n$ or $\mathbb{C}^n$ such that $Au = \lambda u$. If $A$ is real and
the eigenvalue $\lambda$ is a real number, then $u$ can be taken to be a real
vector. However, if $\lambda$ is a complex number, which may happen even if
$A$ is real, the vector $u$ may be a complex vector. The nonzero column
vector $u$ is called an eigenvector of $A$ corresponding to its eigenvalue
$\lambda$. The eigenvalue-eigenvector matrix equation $Au = \lambda u$ can be
rewritten as

$$(\lambda I_n - A)u = 0.$$

Because eigenvectors of a real matrix may be nonreal, we generally
take our eigenvectors in $\mathbb{C}^n$. Since an eigenvector is a nonzero
vector, the equation $(\lambda I_n - A)u = 0$ shows that $\lambda$ is an
eigenvalue of $A$ if and only if $\lambda I_n - A$ is a singular matrix;
equivalently, $\lambda$ is an eigenvalue of $A$ if and only if

$$\det(\lambda I_n - A) = 0. \qquad (7.1)$$

In particular, 0 is an eigenvalue of $A$ if and only if $\det A = 0$, that
is, if and only if $A$ is a singular matrix.


Example 7.1.2 Let

$$A = \begin{bmatrix} 1 & 2 \\ 3 & 2 \end{bmatrix}.$$

A simple computation shows that

$$\det(\lambda I_2 - A) = \lambda^2 - 3\lambda - 4 = (\lambda - 4)(\lambda + 1),$$

which equals zero if and only if $\lambda = 4$ or $-1$. Thus $A$ has two
eigenvalues, namely, 4 and $-1$. We then have

$$4I_2 - A = \begin{bmatrix} 3 & -2 \\ -3 & 2 \end{bmatrix}.$$

To find an eigenvector of $A$ corresponding to its eigenvalue 4, we
need to solve the homogeneous system of two linear equations

$$\begin{aligned} 3x_1 - 2x_2 &= 0 \\ -3x_1 + 2x_2 &= 0. \end{aligned}$$

One solution is $x_1 = 2$ and $x_2 = 3$. Thus $u = [2\ \ 3]^T$ (and any
nonzero multiple of this vector) is an eigenvector of $A$ corresponding
to $\lambda = 4$. A similar computation shows that $u = [1\ \ {-1}]^T$
(and any nonzero multiple of this vector) is an eigenvector of $A$
corresponding to $\lambda = -1$.
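
As a quick check of Example 7.1.2, the sketch below verifies the two eigenpairs directly from the defining equation $Au = \lambda u$. It assumes $A = \begin{bmatrix} 1 & 2 \\ 3 & 2 \end{bmatrix}$, which is the matrix consistent with the displayed $4I_2 - A$ and with the stated characteristic polynomial; the helper name `mat_vec` is hypothetical.

```python
def mat_vec(A, u):
    """Multiply the matrix A (list of rows) by the column vector u."""
    return [sum(a * x for a, x in zip(row, u)) for row in A]


A = [[1, 2], [3, 2]]                      # assumed matrix of Example 7.1.2
pairs = [(4, [2, 3]), (-1, [1, -1])]      # (eigenvalue, eigenvector)

# For each pair, compare A u with lambda * u entry by entry.
checks = [mat_vec(A, u) == [lam * x for x in u] for lam, u in pairs]
```

Any nonzero multiple of either vector would pass the same test, in line with the remark in the example.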


Now let 
1 2 
a=| 4 i 


Then

$$\det(\lambda I_2 - A) = \lambda^2 + 1 = (\lambda - i)(\lambda + i).$$

Hence the eigenvalues of $A$ are $\pm i$.

It follows from the definition of the determinant given in Chapter 4
that if $A$ is a matrix of order $n$, then $p_A(\lambda) = \det(\lambda I_n - A)$
is a polynomial in $\lambda$ of degree $n$, called the characteristic
polynomial of $A$. Since $p_A(\lambda)$ is a polynomial of degree $n$, and
since a polynomial of degree $n$ has $n$ roots (possibly complex numbers
even if $A$ is a real matrix), counting multiplicities, the matrix $A$
has $n$ eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$. These $n$
eigenvalues comprise the spectrum of $A$.


¹An eigenvalue is also called a characteristic value and an eigenvector is
also called a characteristic vector.


Example 7.1.3 Let 


ooo bl 
SPS OU. SS 
oN OQ 
oa oo do 


Then $p_A(\lambda) = \det(\lambda I_4 - A) = (\lambda - 2)^2(\lambda - 5)^2$. Thus
the eigenvalues of $A$ are 2, 2, 5, 5. More generally, the eigenvalues of
a diagonal matrix, indeed of a triangular matrix, of order $n$ are its
$n$ diagonal entries.


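From Corollary 4.3.2 we obtain that the characteristic polynomial
of a matrix $A$ of order $n$ is given by

$$\det(\lambda I_n - A) = \sum_{p=0}^{n} (-1)^{n-p} c_{n-p} \lambda^{p}, \qquad (7.2)$$

where $c_0 = 1$ and, for $p < n$,

$$c_{n-p} = \sum_{1 \le j_1 < j_2 < \cdots < j_{n-p} \le n} \det A[\{j_1, j_2, \ldots, j_{n-p}\}, \{j_1, j_2, \ldots, j_{n-p}\}],$$

the sum of the determinants of all the principal submatrices of $A$
of order $n - p$. In particular, the coefficient of $\lambda^{n-1}$ is
$-c_1$, where

$$c_1 = a_{11} + a_{22} + \cdots + a_{nn} = \operatorname{tr} A,$$

and the constant term is $(-1)^n c_n$, where

$$c_n = \det A.$$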
Example 7.1.4 Let $A = [a_{ij}]$ be a square matrix of order 3. Then

$$\lambda I_3 - A = \begin{bmatrix} \lambda - a_{11} & -a_{12} & -a_{13} \\ -a_{21} & \lambda - a_{22} & -a_{23} \\ -a_{31} & -a_{32} & \lambda - a_{33} \end{bmatrix}.$$

Thus the characteristic polynomial of $A$ is

$$p_A(\lambda) = \lambda^3 - (\operatorname{tr} A)\lambda^2 + c_2\lambda - \det A,$$

where

$$c_2 = (a_{11}a_{22} - a_{12}a_{21}) + (a_{11}a_{33} - a_{13}a_{31}) + (a_{22}a_{33} - a_{23}a_{32}),$$

a sum that can be rewritten as

$$c_2 = \det A[\{1, 2\}, \{1, 2\}] + \det A[\{1, 3\}, \{1, 3\}] + \det A[\{2, 3\}, \{2, 3\}].$$

Thus $c_2$ is the sum of the determinants of the three principal
submatrices of $A$ of order 2.


Example 7.1.5 Let

$$A = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \end{bmatrix}.$$

Since the determinant of a matrix of order at least 2, each of whose
entries equals 1, is 0, computing the characteristic polynomial of
$A$ using formula (7.2), we get that $c_1 = 4$, $c_2 = c_3 = c_4 = 0$, and
hence

$$p_A(\lambda) = \lambda^4 - 4\lambda^3 = \lambda^3(\lambda - 4).$$

Hence the eigenvalues of $A$ are 4, 0, 0, 0. More generally, the
eigenvalues of the matrix of all 1's of order $n$ are $n, 0, \ldots, 0$,
where there are $n - 1$ 0's.
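
The coefficients $c_1, \ldots, c_n$ of formula (7.2) can be computed directly as sums of principal minors. The following is a minimal sketch, not an efficient method (it enumerates all index subsets); the function names `det` and `principal_minor_sums` are hypothetical, and exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction
from itertools import combinations


def det(M):
    """Determinant by Gaussian elimination over exact fractions."""
    M = [[Fraction(x) for x in row] for row in M]
    n, sign, result = len(M), 1, Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)            # no pivot in this column: singular
        if pivot != col:
            M[col], M[pivot] = M[pivot], M[col]
            sign = -sign                  # a row swap flips the sign
        result *= M[col][col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return sign * result


def principal_minor_sums(A):
    """Return [c_1, ..., c_n]; c_k is the sum of all principal minors of order k."""
    n = len(A)
    return [sum(det([[A[i][j] for j in S] for i in S])
                for S in combinations(range(n), k))
            for k in range(1, n + 1)]


ones = [[1] * 4 for _ in range(4)]        # the all-ones matrix of order 4
coeffs = principal_minor_sums(ones)       # c_1, c_2, c_3, c_4
```

For the all-ones matrix only $c_1$ is nonzero, which matches $p_A(\lambda) = \lambda^4 - 4\lambda^3$.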


Example 7.1.6 Let a matrix $A$ of order $n$ be the direct sum of
square matrices, say,

$$A = A_1 \oplus A_2 \oplus A_3$$

of orders $n_1$, $n_2$, and $n_3$, respectively. Then

$$\lambda I_n - A = (\lambda I_{n_1} - A_1) \oplus (\lambda I_{n_2} - A_2) \oplus (\lambda I_{n_3} - A_3),$$

and hence

$$\det(\lambda I_n - A) = \det(\lambda I_{n_1} - A_1)\det(\lambda I_{n_2} - A_2)\det(\lambda I_{n_3} - A_3),$$

and the characteristic polynomial of $A$ is the product of the
characteristic polynomials of $A_1$, $A_2$, and $A_3$. Thus the spectrum
of $A$ is obtained by putting the spectra of $A_1$, $A_2$, and $A_3$ together.


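Let $A$ be a matrix of order $n$ with eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$.
Because the eigenvalues are the $n$ roots of the characteristic
polynomial $p_A(\lambda)$, which has leading coefficient 1, we have

$$\begin{aligned} p_A(\lambda) &= (\lambda - \lambda_1)(\lambda - \lambda_2)\cdots(\lambda - \lambda_n) \\ &= \lambda^n - (\lambda_1 + \lambda_2 + \cdots + \lambda_n)\lambda^{n-1} + \cdots + (-1)^n\lambda_1\lambda_2\cdots\lambda_n. \end{aligned}$$

Comparing with the coefficients of the characteristic polynomial
as given in (7.2), we see that the trace of $A$ is the sum of its $n$
eigenvalues:

$$c_1 = \operatorname{tr} A = \lambda_1 + \lambda_2 + \cdots + \lambda_n,$$

and the determinant of $A$ is the product of these eigenvalues:

$$c_n = \det A = \lambda_1\lambda_2\cdots\lambda_n.$$

Because $\lambda I_n - A^T = (\lambda I_n - A)^T$, and a matrix and its
transpose have the same determinant,

$$p_{A^T}(\lambda) = \det(\lambda I_n - A^T) = \det((\lambda I_n - A)^T) = \det(\lambda I_n - A) = p_A(\lambda),$$

that is, $A$ and $A^T$ have the same characteristic polynomial and
hence the same eigenvalues.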


Definition 7.1.7 Let $A = [a_{ij}]$ be a matrix of order $n$ and let
$\lambda$ be an eigenvalue of $A$. The algebraic multiplicity of $\lambda$ is
its multiplicity as a root of the characteristic polynomial of $A$.
Thus the algebraic multiplicity of an eigenvalue of $A$ is an integer
between 1 and $n$. The eigenspace of $A$ corresponding to $\lambda$ is
defined to be

$$V_\lambda(A) = \{x \in \mathbb{C}^n : (\lambda I_n - A)x = 0\}.$$

The eigenspace of $A$ corresponding to $\lambda$ is the null space of the
matrix $\lambda I_n - A$, and so is a subspace of $\mathbb{C}^n$. (If all $n$
eigenvalues of $A$ are real numbers, then we can take the eigenspace to
be a subspace of $\mathbb{R}^n$. But if $A$ has a complex eigenvalue, then
the eigenspaces are taken to be subspaces of $\mathbb{C}^n$.) The eigenspace
consists of the zero vector and all the eigenvectors of $A$
corresponding to $\lambda$, and thus has dimension at least 1. The
geometric multiplicity of $\lambda$ is the dimension of the eigenspace
$V_\lambda(A)$ and thus equals $n - r(\lambda I_n - A)$.
□


Example 7.1.8 First consider the identity matrix $I_n$. Each of
its $n$ eigenvalues equals 1, and the eigenspace $V_1(I_n)$ is all of
$\mathbb{R}^n$. Thus both the algebraic and geometric multiplicities of 1
equal $n$. More generally, if $D$ is a diagonal matrix of order $n$ with
diagonal entries $d_1, d_2, \ldots, d_n$, then its eigenvalues are
$d_1, d_2, \ldots, d_n$ and the geometric multiplicity of an eigenvalue
equals its algebraic multiplicity. For example, let

$$D = \begin{bmatrix} 4 & 0 & 0 & 0 & 0 \\ 0 & 4 & 0 & 0 & 0 \\ 0 & 0 & 7 & 0 & 0 \\ 0 & 0 & 0 & 7 & 0 \\ 0 & 0 & 0 & 0 & 7 \end{bmatrix}.$$

Then the eigenvalues of $D$ are 4, 4, 7, 7, 7, so that 4 is an eigenvalue
with algebraic multiplicity 2 and 7 is an eigenvalue with algebraic
multiplicity 3. We then have

$$4I_5 - D = \begin{bmatrix} 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & -3 & 0 & 0 \\ 0 & 0 & 0 & -3 & 0 \\ 0 & 0 & 0 & 0 & -3 \end{bmatrix},$$

a matrix of rank 3. Hence the dimension of the null space of
$4I_5 - D$, that is, the geometric multiplicity of 4, equals 2. In a
similar way, we see that the geometric multiplicity of 7 is 3.

Now consider the matrix $T_n$ of order $n$ that has 1's on and above the
main diagonal and 0's elsewhere. For instance, if $n = 4$, then

$$T_4 = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 1 & 1 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 \end{bmatrix}.$$

The characteristic polynomial of $T_n$ is $p_{T_n}(\lambda) = (\lambda - 1)^n$ and so
1 is an eigenvalue of $T_n$ with algebraic multiplicity equal to $n$.
The eigenspace $V_1(T_n)$ consists of all vectors $u = [u_1\ u_2\ \ldots\ u_n]^T$
such that $(I_n - T_n)u = 0$. The matrix $I_n - T_n$ clearly has rank
equal to $n - 1$ and hence the dimension of its null space equals 1.
Thus the geometric multiplicity of 1 equals 1. We conclude from
this example that the algebraic and geometric multiplicities of an
eigenvalue need not be equal and indeed may be quite different.
These facts play an important role later in this chapter.
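
The formula $n - r(\lambda I_n - A)$ for the geometric multiplicity is easy to evaluate mechanically. The sketch below does so for the two matrices of Example 7.1.8, using exact Gaussian elimination; the function names `rank` and `geometric_multiplicity` are hypothetical, and the eigenvalue is assumed to be supplied by the caller.

```python
from fractions import Fraction


def rank(M):
    """Rank of a matrix (list of rows) via exact Gaussian elimination."""
    M = [[Fraction(x) for x in row] for row in M]
    rows, cols, r = len(M), len(M[0]), 0
    for c in range(cols):
        if r == rows:
            break
        pivot = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pivot is None:
            continue                      # no pivot in this column
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, rows):
            f = M[i][c] / M[r][c]
            M[i] = [x - f * y for x, y in zip(M[i], M[r])]
        r += 1
    return r


def geometric_multiplicity(A, lam):
    """Dimension of the null space of lam*I - A."""
    n = len(A)
    shifted = [[(lam if i == j else 0) - A[i][j] for j in range(n)]
               for i in range(n)]
    return n - rank(shifted)


D = [[d if i == j else 0 for j in range(5)]
     for i, d in enumerate([4, 4, 7, 7, 7])]
T4 = [[1 if j >= i else 0 for j in range(4)] for i in range(4)]

results = (geometric_multiplicity(D, 4),
           geometric_multiplicity(D, 7),
           geometric_multiplicity(T4, 1))
```

The three values are the geometric multiplicities discussed in the example: 2 and 3 for the diagonal matrix, and 1 for $T_4$ even though the algebraic multiplicity of 1 there is 4.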


We conclude this section by showing that all the eigenvalues of a
real symmetric matrix are real numbers.


Theorem 7.1.9 Let $A = [a_{ij}]$ be a real symmetric matrix of order $n$.
Then each of the $n$ eigenvalues of $A$ is a real number.


Proof. We use the dot product of $\mathbb{C}^n$ as reviewed in Section
1.5. Let $\lambda$ be any eigenvalue of $A$, and let $x$ be an eigenvector (a
nonzero vector in $\mathbb{C}^n$) for $\lambda$:

$$Ax = \lambda x. \qquad (7.3)$$

For a matrix $B$, let $B^*$ denote the matrix obtained from $B$ by
replacing each of its entries by its complex conjugate and
transposing (or, equivalently, transposing and then conjugating each
entry). We have $(BC)^* = C^*B^*$, since $(BC)^T = C^TB^T$, and
$\overline{a + b} = \bar{a} + \bar{b}$ and $\overline{ab} = \bar{a}\,\bar{b}$. Since $A$ is real and
symmetric, $A^* = A$. Multiplying (7.3) on the left by $x^*$ we get
$x^*Ax = \lambda x^*x$. We also have from (7.3) that

$$x^*Ax = x^*A^*x = (Ax)^*x = (\lambda x)^*x = \bar{\lambda}x^*x.$$

Thus $\lambda x^*x = \bar{\lambda}x^*x$. Because $x$ is not a zero vector, $x^*x \neq 0$, and
hence $\lambda = \bar{\lambda}$ and $\lambda$ is a real number.


7.2 The Cayley-Hamilton Theorem


The Cayley-Hamilton theorem asserts the rather surprising result 
that a square matrix satisfies its characteristic polynomial. This is 
a very important theorem with theoretical consequences and with 
applications in physics and engineering disciplines. Before giving a 
precise statement of the Cayley-Hamilton theorem, we introduce 
some concepts that we will use in its verification. 


Definition 7.2.1 Let $D$ be a digraph with vertices $1, 2, \ldots, n$,
and let $i$ and $j$ be vertices of $D$. We recall that a 1-connection
$D[i \to j]$ of vertex $i$ to vertex $j$ is a spanning subdigraph of $D$
consisting of a path from $i$ to $j$ and a possibly empty collection of
pairwise vertex-disjoint cycles having no vertex in common with
the path. The total number of edges in a 1-connection equals
$n - 1$. A quasi-1-connection $D[i \to j]^*$ from $i$ to $j$ is defined like
a 1-connection except that the path from $i$ to $j$ is replaced with a
walk from $i$ to $j$ of length at most $n$, where the walk may intersect
the cycles and where the total number of edges in the walk and
cycles is to equal $n$.² Thus a quasi-1-connection $D[i \to j]^*$ is a pair
consisting of a walk $\gamma$ from $i$ to $j$ and a possibly empty collection $C$
of pairwise vertex-disjoint cycles, where the total number of edges
in the walk and cycles equals $n$. The weight $w(D[i \to j]^*)$ of a
quasi-1-connection $D[i \to j]^*$ is the product of the weights of all
its edges. The number of cycles in the quasi-1-connection is the
number of cycles in $C$, and this number is denoted by $c(D[i \to j]^*)$.


Theorem 7.2.2 (Cayley-Hamilton theorem) Let $A = [a_{ij}]$ be a
matrix of order $n$, and let

$$p_A(\lambda) = \lambda^n - c_1\lambda^{n-1} + c_2\lambda^{n-2} - \cdots + (-1)^{n-1}c_{n-1}\lambda + (-1)^n c_n$$

be the characteristic polynomial of $A$. Then $p_A(A) = O$, that is,

$$A^n - c_1A^{n-1} + c_2A^{n-2} - \cdots + (-1)^{n-1}c_{n-1}A + (-1)^n c_nI_n = O. \qquad (7.4)$$


²Unlike a 1-connection, the total number of edges in a quasi-1-connection
is not determined by its walk and cycles.

Before proving this theorem we give an example.


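Example 7.2.3 Let

$$A = \begin{bmatrix} 1 & 2 & 0 \\ -1 & 3 & 1 \\ 2 & 1 & 1 \end{bmatrix}.$$

A simple calculation shows that

$$p_A(\lambda) = \lambda^3 - 5\lambda^2 + 8\lambda - 8.$$

We calculate that

$$A^2 = \begin{bmatrix} -1 & 8 & 2 \\ -2 & 8 & 4 \\ 3 & 8 & 2 \end{bmatrix} \quad\text{and}\quad A^3 = \begin{bmatrix} -5 & 24 & 10 \\ -2 & 24 & 12 \\ -1 & 32 & 10 \end{bmatrix}.$$

Substituting into the characteristic polynomial of $A$ we obtain

$$\begin{bmatrix} -5 & 24 & 10 \\ -2 & 24 & 12 \\ -1 & 32 & 10 \end{bmatrix} - 5\begin{bmatrix} -1 & 8 & 2 \\ -2 & 8 & 4 \\ 3 & 8 & 2 \end{bmatrix} + 8\begin{bmatrix} 1 & 2 & 0 \\ -1 & 3 & 1 \\ 2 & 1 & 1 \end{bmatrix} - 8\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} = O.$$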

Proof of Theorem 7.2.2. We have to show that each entry of
the matrix $p_A(A)$ as given in (7.4) equals 0. The coefficient of
$\lambda^k$ in the characteristic polynomial is $(-1)^{n-k}c_{n-k}$, where
$c_{n-k}$ equals the sum of all the determinants of the principal
submatrices of $A$ of order $n - k$. It follows from the definition of
the determinant given in Chapter 4 that $c_{n-k}$ equals

$$\sum_{L} (-1)^{n-k-c(L)} w(L),$$

where the summation extends over all linear subdigraphs $L$ of the
Coates digraph $D^*(A)$ having $n - k$ vertices and where $c(L)$ is the
number of cycles of $L$. From Chapter 3 we know that the entry in
position $(i, j)$ of $A^k$ equals the sum of the weights of all walks of
length $k$ from vertex $i$ to vertex $j$.
Therefore the entry in position $(i, j)$ of $(-1)^{n-k}c_{n-k}A^k$ equals

$$\sum_{D[i \to j]^*_k} (-1)^{c(D[i \to j]^*)} w(D[i \to j]^*),$$

where the summation extends over all quasi-1-connections
$D[i \to j]^*_k$ whose walk $\gamma$ from $i$ to $j$ has length $k$. We thus
conclude that the entry in position $(i, j)$ of $p_A(A)$ equals

$$\sum_{D[i \to j]^*} (-1)^{c(D[i \to j]^*)} w(D[i \to j]^*), \qquad (7.5)$$

where the summation now extends over all quasi-1-connections from
$i$ to $j$. (Recall that the walk in a quasi-1-connection has length at
most $n$.)

We now show how to pair up the terms³ in (7.5) so that the sum
of the terms of each pair equals zero. The total number of edges in
a quasi-1-connection consisting of a walk $\gamma$ and a collection $C$ of
pairwise vertex-disjoint cycles equals $n$. Thus either the walk $\gamma$ is
not a path (and so contains a cycle) or the walk $\gamma$ meets a vertex
of one of the cycles in $C$. We proceed along the walk $\gamma$ until we
first (a) revisit a vertex, or (b) visit a vertex of one of the cycles
$m$ in $C$, whichever comes first (note that these two events cannot
occur simultaneously). In case (a), the walk contains a cycle and
we remove this cycle from $\gamma$ and add it to $C$ to create a new
quasi-1-connection with one more cycle but the same weight. In case (b)
we remove the cycle $m$ from $C$ and add it to the walk $\gamma$ to make it
longer. In each case the number of cycles changes by 1, so the sign
in (7.5) changes but the weight of the quasi-1-connection does not
change. This process is reversible, leading us to conclude that the
sum in (7.5) equals zero. Hence $p_A(A) = O$, as was to be proved.
□


We have given a proof of the classical Cayley-Hamilton the- 
orem that illustrates that it is really a theorem about weighted 
digraphs; this was first noted by Rutherford [70] (see also [75],
[81] and [7]).


³That is, establish an involution.

7.3 Similar Matrices and the JCF


An $m$ by $n$ matrix $A = [a_{ij}]$ represents the linear transformation
$T$ from an $n$-tuple space $F^n$ to an $m$-tuple space $F^m$ defined by
multiplication by $A$ as follows: If $x = [x_1\ x_2\ \cdots\ x_n]^T$ is an
$n$-tuple, then

$$T(x) = A\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_m \end{bmatrix}.$$

Here the vectors $x$ and $y$ are given in terms of their coordinates
with respect to the standard (ordered) basis $\eta_n = (e_1, e_2, \ldots, e_n)$
of $F^n$ and the standard (ordered) basis $\eta_m = (e'_1, e'_2, \ldots, e'_m)$
of $F^m$, respectively. If we choose a different (ordered) basis
$\alpha = (u^{(1)}, u^{(2)}, \ldots, u^{(n)})$ for $F^n$ and a different (ordered) basis
$\beta = (w^{(1)}, w^{(2)}, \ldots, w^{(m)})$ for $F^m$, then, with respect to
coordinates relative to these bases, a different matrix will represent
the linear transformation $T$. A vector $u$ in $F^n$ can be uniquely
represented as a linear combination of the vectors in $\alpha$:

$$u = p_1u^{(1)} + p_2u^{(2)} + \cdots + p_nu^{(n)},$$

and has coordinate vector

$$[u]_\alpha = \begin{bmatrix} p_1 \\ p_2 \\ \vdots \\ p_n \end{bmatrix}$$

with respect to the basis $\alpha$. Similarly, the vector $w = T(u)$ in $F^m$
can be uniquely represented as a linear combination of the vectors
in $\beta$:

$$w = q_1w^{(1)} + q_2w^{(2)} + \cdots + q_mw^{(m)},$$

and has coordinate vector

$$[T(u)]_\beta = \begin{bmatrix} q_1 \\ q_2 \\ \vdots \\ q_m \end{bmatrix}$$

with respect to the basis $\beta$. Let $R = [r_{ij}]$ be the $n$ by $n$ matrix
whose column vectors are the coordinate vectors

$$[u^{(1)}]_{\eta_n}, [u^{(2)}]_{\eta_n}, \ldots, [u^{(n)}]_{\eta_n},$$

and let $S = [s_{ij}]$ be the $m$ by $m$ matrix whose column vectors are
the coordinate vectors

$$[e'_1]_\beta, [e'_2]_\beta, \ldots, [e'_m]_\beta.$$

A straightforward calculation shows that

$$\begin{aligned}
T(u) &= T\Big(\sum_{i=1}^{n} p_iu^{(i)}\Big) \\
&= \sum_{i=1}^{n} p_i\, T\Big(\sum_{j=1}^{n} r_{ji}e_j\Big) \\
&= \sum_{i=1}^{n}\sum_{j=1}^{n} p_ir_{ji}\,T(e_j) \\
&= \sum_{i=1}^{n}\sum_{j=1}^{n}\sum_{k=1}^{m} p_ir_{ji}a_{kj}\,e'_k \\
&= \sum_{i=1}^{n}\sum_{j=1}^{n}\sum_{k=1}^{m}\sum_{l=1}^{m} p_ir_{ji}a_{kj}s_{lk}\,w^{(l)} \\
&= \sum_{l=1}^{m}\Big(\sum_{k=1}^{m}\sum_{j=1}^{n}\sum_{i=1}^{n} s_{lk}a_{kj}r_{ji}p_i\Big)w^{(l)}.
\end{aligned}$$

Thus

$$[T(u)]_\beta = SAR[u]_\alpha.$$

In the special case that $n = m$ and $\alpha = \beta$, the matrix $S$ equals
$R^{-1}$, and hence we get

$$[T(u)]_\alpha = R^{-1}AR[u]_\alpha.$$

Hence, the matrix of the linear transformation $T : \mathbb{R}^n \to \mathbb{R}^n$ with
respect to a basis $\alpha$ of $\mathbb{R}^n$ equals $R^{-1}AR$, where $A$ is the matrix of
$T$ with respect to the standard basis $\eta_n$ of $\mathbb{R}^n$. This motivates the
following definition.


Definition 7.3.1 Let $A$ and $B$ be square matrices of the same
order $n$. Then $B$ is similar to $A$ provided there is an invertible
matrix $X$ such that $B = X^{-1}AX$. If $B$ is similar to $A$, then we
write $B \sim A$.


The relation of similarity on matrices of order $n$ satisfies three
important properties:


(R) reflexive property: $B \sim B$ for all $B$. This is because we may
take $X = I_n$ in the definition of similarity.


(S) symmetric property: If $B \sim A$, then also $A \sim B$. If $B \sim A$,
then $B = X^{-1}AX$ for some invertible matrix $X$, and then
$A = XBX^{-1} = (X^{-1})^{-1}B(X^{-1}) = Y^{-1}BY$, where $Y = X^{-1}$.


The symmetry property implies that we may simply say that
matrices $A$ and $B$ are similar rather than say that $A$ is similar
to $B$.


(T) transitive property: If $A \sim B$ and $B \sim C$, then $A \sim C$.
If $A \sim B$ and $B \sim C$, then $A = X^{-1}BX$ and $B = Y^{-1}CY$
for some invertible matrices $X$ and $Y$. Hence

$$A = X^{-1}Y^{-1}CYX = (YX)^{-1}C(YX) = Z^{-1}CZ,$$

where $Z = YX$.


The reflexive, symmetric, and transitive properties are the three
properties defining an equivalence relation. An equivalence relation
on a set always partitions the set into equivalence classes
whereby two elements in the same class are equivalent and two
elements in different classes are not equivalent. Thus similarity
defines an equivalence relation on the set of square matrices of
order $n$ and partitions the matrices into similarity classes, that is,
classes such that matrices in the same class are similar and those
in different classes are not.

In the next theorem we collect several elementary properties 
of similar matrices. 


Theorem 7.3.2 Let A and B be similar matrices of order n. 
Then the following properties hold: 


(i) det A = det B. 
(ii) The rank of A equals the rank of B. 


(iii) A and B have the same characteristic polynomial and the 
same spectrum. Thus the algebraic multiplicity of eigenval- 
ues is the same for A and for B. 


(iv) Let $B$ be similar to $A$ with $B = X^{-1}AX$, and let $\lambda$ be
an eigenvalue of $A$ with corresponding eigenvector $u$. Then
$X^{-1}u$ is an eigenvector of $B$ corresponding to its eigenvalue
$\lambda$. Likewise, if $v$ is an eigenvector of $B$ corresponding to
eigenvalue $\lambda$, then $Xv$ is an eigenvector of $A$ corresponding
to eigenvalue $\lambda$. Thus

$$V_\lambda(B) = \{X^{-1}u : u \in V_\lambda(A)\},$$

and the geometric multiplicities of the eigenvalues are the
same for $A$ and for $B$.


Proof. Assume that $B = X^{-1}AX$. Then

$$\det B = \det(X^{-1}AX) = \det X^{-1}\det A\det X = (\det X)^{-1}\det A\det X = \det A.$$

Thus (i) holds. Assertion (ii) follows from the fact that multiplying
a matrix by an invertible matrix, thus by a product of elementary
matrices, does not change its rank. For (iii) we calculate that

$$\det(\lambda I_n - B) = \det(\lambda I_n - X^{-1}AX) = \det(X^{-1}(\lambda I_n - A)X) = \det X^{-1}\det(\lambda I_n - A)\det X = \det(\lambda I_n - A).$$

For property (iv) we simply calculate that

$$B(X^{-1}u) = (X^{-1}AX)(X^{-1}u) = X^{-1}Au = X^{-1}\lambda u = \lambda(X^{-1}u)$$

and note that since $u$ is a nonzero vector, so is $X^{-1}u$. Conversely,
let $\lambda$ be an eigenvalue of $A$, and thus of $B$, and let $Bv = \lambda v$
with $v$ nonzero. A similar calculation shows that $A(Xv) = \lambda(Xv)$.
Thus $V_\lambda(B) = \{X^{-1}u : u \in V_\lambda(A)\}$. Because $X^{-1}$ is a
nonsingular matrix, it follows that $V_\lambda(A)$ and $V_\lambda(B)$ have the
same dimension.


Example 7.3.3 The conditions (i)-(iii) do not guarantee that the
matrices $A$ and $B$ are similar. For example, let

$$A = I_2 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \quad\text{and}\quad B = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}.$$

Then $A$ and $B$ have determinant equal to 1, have rank equal to
2, and have characteristic polynomial equal to $(\lambda - 1)^2$. But $A$
and $B$ are not similar since the only matrix similar to $I_2$ is $I_2$ (in
general, $X^{-1}I_nX = I_n$). The geometric multiplicity of 1 is 2 for $A$
and 1 for $B$.

If we also assume that the geometric multiplicities of the cor- 
responding eigenvalues of A and B are equal, we still cannot con- 
clude that A and B are similar. For example, let 


0 
and B = 


D> 

| 
oooo 
oooo 
oooo 
ooro 
oroo 


0 
0 
1 
0 


oo aS — 
o-oo 


Then 0 is an eigenvalue of $A$ and of $B$ with algebraic multiplicity
4 and geometric multiplicity 2. By calculation we see that $A^2 = O$
but $B^2 \neq O$. But if $B = X^{-1}AX$, then $B^2 = X^{-1}A^2X =
X^{-1}OX = O$. So $B$ is not similar to $A$.
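
The obstruction used here (one matrix squares to zero, the other does not) can be illustrated with a small sketch. The two matrices below are hypothetical stand-ins chosen to have the properties stated in the example, namely eigenvalue 0 with algebraic multiplicity 4 and geometric multiplicity 2; they are not claimed to be the matrices displayed in the example. The helper names `shift_matrix`, `mat_mul`, and `is_zero` are likewise assumptions.

```python
def shift_matrix(block_sizes):
    """Nilpotent matrix with 1's on the superdiagonal inside each block."""
    n = sum(block_sizes)
    N = [[0] * n for _ in range(n)]
    start = 0
    for size in block_sizes:
        for t in range(start, start + size - 1):
            N[t][t + 1] = 1
        start += size
    return N


def mat_mul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def is_zero(M):
    return all(x == 0 for row in M for x in row)


A = shift_matrix([2, 2])   # two blocks of order 2
B = shift_matrix([3, 1])   # one block of order 3, one of order 1

a_squared_is_zero = is_zero(mat_mul(A, A))
b_squared_is_zero = is_zero(mat_mul(B, B))
```

Both stand-ins have exactly two nonzero rows, hence rank 2 and geometric multiplicity $4 - 2 = 2$ for the eigenvalue 0, yet only the first squares to the zero matrix, so they cannot be similar.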


Example 7.3.4 Let $A$ and $B$ be two square matrices of the same
order $n$. Suppose that the digraphs $D(A)$ and $D(B)$ are isomorphic.
Then there is a permutation matrix $P$ such that $B = P^TAP$.
Since the inverse of a permutation matrix equals its transpose, we
can rewrite this equation as $B = P^{-1}AP$, and hence $A$ and $B$
are similar via a permutation matrix. The converse also holds.
Thus the isomorphism of digraphs is equivalent to similarity via
a permutation matrix. A similar conclusion holds if we consider
the Coates digraphs $D^*(A)$ and $D^*(B)$. On the other hand, if
the König digraphs $G(A)$ and $G(B)$ are isomorphic, the matrices
$A$ and $B$ are connected by the relation $A = QBP$, where $P$ and
$Q$ are permutation matrices. Here the permutation matrix $Q$
permutes the black vertices of the digraph $G(B)$ and the permutation
matrix $P$ independently permutes the white vertices. If the matrices
$A$ and $B$ are similar via a permutation matrix, requiring
$Q = P^{-1}$, then the digraphs $G(A)$ and $G(B)$ are isomorphic where
the isomorphism preserves the initial pairing of the black and white
vertices. This means that if the isomorphism takes the black vertex
$i$ of $G(B)$ to the black vertex $j$ of $G(A)$, then it takes the white
vertex $i$ of $G(B)$ to the white vertex $j$ of $G(A)$. Hence, the digraph
$G(A)$ can be obtained from the digraph $G(B)$ by a permutation of the
pairs of black and white vertices.
□


Definition 7.3.5 Let $A$ be a square matrix of order $n$. Then an
elementary similarity of $A$ is a matrix $B = E^{-1}AE$, where $E$ is
an elementary matrix. Since there are three types of elementary
matrices, there are three types of elementary similarities:


(i) (elementary permutation similarity), in which rows $i$ and $j$
are switched and columns $i$ and $j$ are switched ($i \neq j$). In
particular, the $(i, i)$ and $(j, j)$ entries of the main diagonal of
$A$ are switched and the $(i, j)$ and $(j, i)$ entries are switched in
this type of elementary similarity.


(ii) (elementary diagonal similarity)

$$I_n(c \cdot i)\,A\,I_n(c \cdot i)^{-1} = I_n(c \cdot i)\,A\,I_n((1/c) \cdot i),$$

in which row $i$ is multiplied by $c$ and column $i$ is multiplied
by $1/c$ ($c \neq 0$). There is no change in the entries of the main
diagonal of $A$ in this type of similarity.

(iii) (elementary combination similarity)

$$I_n(c \cdot i + j)\,A\,I_n(c \cdot i + j)^{-1} = I_n(c \cdot i + j)\,A\,I_n(-c \cdot i + j),$$

in which $c$ times row $i$ is added to row $j$ and $-c$ times column
$j$ is added to column $i$ ($i \neq j$).


Since a matrix is invertible if and only if it is a product of
elementary matrices, we conclude that $A$ is similar to a matrix $B$ if
and only if $B$ can be obtained from $A$ by a sequence of elementary
similarities.
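
An elementary combination similarity amounts to one row operation followed by the matching column operation. The sketch below implements type (iii) with 0-based indices and checks, on one small matrix, that the trace and determinant are unchanged, as they must be for similar matrices. The function name `combination_similarity` and the test matrix are assumptions made for illustration.

```python
def combination_similarity(A, c, i, j):
    """Add c*(row i) to row j, then add -c*(column j) to column i.

    Indices are 0-based and i != j; the input matrix is not modified.
    """
    B = [row[:] for row in A]
    B[j] = [x + c * y for x, y in zip(B[j], B[i])]   # row operation
    for row in B:
        row[i] -= c * row[j]                         # column operation
    return B


def trace(M):
    return sum(M[t][t] for t in range(len(M)))


def det2(M):
    """Determinant of a 2 by 2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]


A = [[1, 2], [3, 2]]
B = combination_similarity(A, 2, 0, 1)
```

Here `B` differs from `A` in every entry, yet `trace` and `det2` return the same values for both, which is what Theorem 7.3.2 predicts.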


We now consider the question of how simple a matrix we can
find in each similarity class. Here by simple we mean a matrix $B$
for which the structure of the digraph associated with the
nonzero off-diagonal entries of $B$ is simple, with very few edges (so
we ignore all loops of the digraph $D(B)$). For the remainder of this
section, $D(B)$ denotes the digraph obtained from the digraph of $B$
by removing all loops. Thus, if $B$ is a diagonal matrix, $D(B)$ is a
digraph with no edges and thus consists of a collection of isolated
vertices. This is the simplest possible structure but one that cannot
always be attained. For example, for the matrix

$$B = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}$$

from Example 7.3.3, the digraph $D(B)$ consists of two vertices and
an edge from one to the other. The matrix $B$ cannot be similar
to a diagonal matrix, as that diagonal matrix would have to be
$I_2$ and this has already been ruled out in Example 7.3.3. Since
similar matrices have the same spectrum, if a matrix $A$ is similar
to a diagonal matrix $B$, then the entries on the main diagonal of
$B$ are the $n$ eigenvalues of $A$.

A matrix is diagonalizable provided it is similar to a diagonal 
matrix. In the next theorem we give a characterization, in terms 
of eigenvectors, of diagonalizable matrices. 


Theorem 7.3.6 Let $A$ be a square matrix of order $n$. Then $A$ is
diagonalizable if and only if $A$ has $n$ linearly independent
eigenvectors, that is, there is a basis of $\mathbb{C}^n$ (or $\mathbb{R}^n$ if $A$ is real
and only has real eigenvalues) consisting of eigenvectors of $A$.

Proof. First suppose that $u^{(1)}, u^{(2)}, \ldots, u^{(n)}$ are $n$ linearly
independent eigenvectors of $A$ with

$$Au^{(i)} = \lambda_iu^{(i)} \quad (i = 1, 2, \ldots, n). \qquad (7.6)$$

Let $U$ be the matrix whose columns are $u^{(1)}, u^{(2)}, \ldots, u^{(n)}$,
respectively. Then $U$ is an invertible matrix and the equations in (7.6)
can be written as the one matrix equation $AU = U\Lambda$, where $\Lambda$ is
the diagonal matrix

$$\Lambda = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix}.$$

Since $U$ is invertible, we have $U^{-1}AU = \Lambda$, and $A$ is similar to a
diagonal matrix.

Conversely, if $A$ is similar to the diagonal matrix $\Lambda$, then there
is an invertible matrix $P$ such that $P^{-1}AP = \Lambda$ and so $AP = P\Lambda$.
Since $P$ is invertible, we see that the columns of $P$ are $n$ linearly
independent eigenvectors of $A$.
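
The matrix equation $AU = U\Lambda$ from the proof can be checked on a small case. The sketch below uses the eigenpairs found in Example 7.1.2 and assumes $A = \begin{bmatrix} 1 & 2 \\ 3 & 2 \end{bmatrix}$ (the matrix consistent with that example); the helper name `mat_mul` is hypothetical.

```python
def mat_mul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]


A = [[1, 2], [3, 2]]          # assumed matrix of Example 7.1.2
U = [[2, 1], [3, -1]]         # columns: eigenvectors for 4 and for -1
Lam = [[4, 0], [0, -1]]       # diagonal matrix of the eigenvalues

AU = mat_mul(A, U)
ULam = mat_mul(U, Lam)
det_U = U[0][0] * U[1][1] - U[0][1] * U[1][0]   # nonzero means U is invertible
```

Since `AU` equals `ULam` and `det_U` is nonzero, $U^{-1}AU = \Lambda$, so this $A$ is diagonalizable.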


We shall find a simple matrix in each similarity class in steps. 
We first prove a theorem that can be rephrased to say that a 
matrix is similar to a matrix T whose digraph D(T) is acyclic, 
that is, has no cycles; indeed, the vertices can be ordered from top 
to bottom with all edges pointing downward. 


Theorem 7.3.7 A matrix $A$ of order $n$ is similar to an upper
triangular matrix $T$. The diagonal entries of $T$ are then the
eigenvalues of $A$, and $T$ can be chosen so that these eigenvalues appear
on its main diagonal in any specified order $\lambda_1, \lambda_2, \ldots, \lambda_n$.


Proof. The proof is by induction on $n$. If $n = 1$, there is
nothing to prove as a square matrix of order 1 is upper triangular.
Let $n > 1$, and let $u^{(1)}$ be an eigenvector of $A$ corresponding to
the eigenvalue $\lambda_1$. Since $u^{(1)}$ is not the zero vector, it can be
extended to a basis $u^{(1)}, u^{(2)}, \ldots, u^{(n)}$. Let $U$ be the matrix
whose columns are $u^{(1)}, u^{(2)}, \ldots, u^{(n)}$, respectively. Then $U$ is
an invertible matrix. The equations $U^{-1}U = I_n$ and
$Au^{(1)} = \lambda_1u^{(1)}$ imply that

$$U^{-1}u^{(1)} = \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix}$$

and

$$U^{-1}AU = \begin{bmatrix} \lambda_1 & \alpha_1 \\ 0 & A_1 \end{bmatrix}, \qquad (7.7)$$

where $\alpha_1$ is a 1 by $n - 1$ matrix and $A_1$ is a square matrix of order
$n - 1$. The matrix in (7.7) is similar to $A$ and hence the eigenvalues
of $A_1$ are $\lambda_2, \ldots, \lambda_n$. By induction, there is an invertible matrix
$W_1$ of order $n - 1$ such that $W_1^{-1}A_1W_1 = T_1$, where $T_1$ is an
upper triangular matrix with $\lambda_2, \ldots, \lambda_n$ on its main diagonal in
this order. Define a partitioned matrix of order $n$ by

$$W = \begin{bmatrix} 1 & 0 \\ 0 & W_1 \end{bmatrix}.$$

The matrix $W$ is invertible with

$$W^{-1} = \begin{bmatrix} 1 & 0 \\ 0 & W_1^{-1} \end{bmatrix}.$$

Then $UW$ is an invertible matrix, and using block multiplication,
we get that

$$(UW)^{-1}A(UW) = W^{-1}U^{-1}AUW = \begin{bmatrix} \lambda_1 & \beta \\ 0 & T_1 \end{bmatrix},$$

where $\beta = \alpha_1W_1$, an upper triangular matrix with diagonal entries
$\lambda_1, \lambda_2, \ldots, \lambda_n$. Hence the theorem holds by induction.


Corollary 7.3.8 Let $A$ be a matrix of order $n$ with eigenvalues
$\lambda_1, \lambda_2, \ldots, \lambda_n$. Let $k$ be a positive integer. Then the eigenvalues of
$A^k$ are $\lambda_1^k, \lambda_2^k, \ldots, \lambda_n^k$. More generally, if
$q(x) = c_mx^m + c_{m-1}x^{m-1} + \cdots + c_1x + c_0$ is a polynomial, then the
eigenvalues of

$$q(A) = c_mA^m + c_{m-1}A^{m-1} + \cdots + c_1A + c_0I_n$$

are $q(\lambda_1), q(\lambda_2), \ldots, q(\lambda_n)$. In addition, if $A$ is invertible, then
the eigenvalues of $A^{-1}$ are $\lambda_1^{-1}, \lambda_2^{-1}, \ldots, \lambda_n^{-1}$.


Proof. By Theorem 7.3.7 there is an invertible matrix $Q$ such
that $A = Q^{-1}TQ$, where $T$ is an upper triangular matrix with
$\lambda_1, \lambda_2, \ldots, \lambda_n$ on its main diagonal. Then

$$A^k = Q^{-1}T^kQ,$$

and hence $A^k$ is similar to $T^k$. Since $T^k$ is an upper triangular
matrix whose entries on the main diagonal are $\lambda_1^k, \lambda_2^k, \ldots, \lambda_n^k$, the
eigenvalues of $A^k$ are $\lambda_1^k, \lambda_2^k, \ldots, \lambda_n^k$. More generally,

$$q(A) = q(Q^{-1}TQ) = Q^{-1}q(T)Q,$$

and it follows that the eigenvalues of $q(A)$ are
$q(\lambda_1), q(\lambda_2), \ldots, q(\lambda_n)$. If $A$ is invertible, then

$$A^{-1} = (Q^{-1}TQ)^{-1} = Q^{-1}T^{-1}Q,$$

where $T^{-1}$ is an upper triangular matrix similar to $A^{-1}$, whose
entries on the main diagonal are $\lambda_1^{-1}, \lambda_2^{-1}, \ldots, \lambda_n^{-1}$. Hence these
are the $n$ eigenvalues of $A^{-1}$.


Our next goal is to show that the matrix T in Theorem 7.3.7, 
for which D(T) is acyclic, is similar to a matrix J such that the 
digraph D(J) is a collection of vertex-disjoint paths (and so cer- 
tainly acyclic). 


Definition 7.3.9 Let k be a positive integer. A matrix of order 
k of the form 


uili- 0 0] 
Ou- 0 0 
Alus: i: E 
0 0 ul 
0 0 -0 u 
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with w’s on the main diagonal and 1’s on the superdiagonal (the 
diagonal immediately above the main diagonal) is called a Jordan 
block of order k. If k = 1, then Jı(u) is a matrix of order 1 
whose unique entry equals u. The Jordan block J(u) has a 
characteristic polynomial (A — u)" and hence has u as an eigen- 
value of algebraic multiplicity k. The geometric multiplicity of 
u equals the dimension of the eigenspace V,,(J,), and this equals 
k —r(yl, — Jn) =k —(k — 1) =1. Notice that the digraph D(J,) 
is a path with k vertices. 


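A matrix that is the direct sum of Jordan blocks,

\[
J = J_{k_1}(\lambda_1) \oplus J_{k_2}(\lambda_2) \oplus \cdots \oplus J_{k_t}(\lambda_t),
\]

is called a Jordan matrix. In a Jordan matrix, t may equal 1 (that
is, J may be a Jordan block), and the scalars
$\lambda_1, \lambda_2, \ldots, \lambda_t$ need not be different. The
characteristic polynomial of J equals

\[
(\lambda - \lambda_1)^{k_1} (\lambda - \lambda_2)^{k_2} \cdots (\lambda - \lambda_t)^{k_t},
\]

and hence its eigenvalues are

\[
\lambda_1 \ (k_1 \text{ times}), \quad \lambda_2 \ (k_2 \text{ times}), \quad \ldots, \quad \lambda_t \ (k_t \text{ times}),
\]

the $n = k_1 + k_2 + \cdots + k_t$ scalars that occur on the main diagonal
of J. The scalars $\lambda_1, \lambda_2, \ldots, \lambda_t$ are not
necessarily distinct, so that the algebraic multiplicities of the
eigenvalues are not necessarily $k_1, k_2, \ldots, k_t$. If $\mu$ is one of
the numbers $\lambda_1, \lambda_2, \ldots, \lambda_t$, then the algebraic
multiplicity of $\mu$ equals the sum of the orders of the Jordan blocks
whose diagonal entries equal $\mu$. Each Jordan block with diagonal
entries equal to $\mu$ contributes 1 to the geometric multiplicity of
$\mu$, and hence the geometric multiplicity of $\mu$ equals the number of
Jordan blocks containing $\mu$ on their main diagonals. Finally, we note
that the digraph D(J) is a collection of vertex-disjoint paths with
$k_1, k_2, \ldots, k_t$ vertices, respectively.

Example 7.3.10 The matrix

\[
J = \begin{bmatrix}
5 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 5 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 5 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 5 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 8 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 8 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 8 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 8 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 8 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 8
\end{bmatrix}
\]

is a Jordan matrix $J_1(5) \oplus J_3(5) \oplus J_2(8) \oplus J_4(8)$ of
order 10. The eigenvalues of J are 5 with algebraic multiplicity
1 + 3 = 4 and 8 with algebraic multiplicity 2 + 4 = 6. The geometric
multiplicity of 5 equals 2, the number of Jordan blocks with 5 on their
main diagonal; the geometric multiplicity of 8 also equals 2, the number
of Jordan blocks with 8 on the main diagonal.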
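
The multiplicities in Example 7.3.10 can be checked mechanically. The following Python sketch builds a Jordan matrix from a list of (order, eigenvalue) pairs and computes the geometric multiplicity of an eigenvalue as n minus the rank of the shifted matrix. The helper names `jordan_matrix`, `rank`, and `multiplicities` are illustrative choices and do not come from the text; exact rational arithmetic is used so that no rounding occurs.

```python
from fractions import Fraction


def jordan_matrix(blocks):
    """Direct sum of Jordan blocks given as (order k, eigenvalue mu) pairs."""
    n = sum(k for k, _ in blocks)
    J = [[Fraction(0)] * n for _ in range(n)]
    start = 0
    for k, mu in blocks:
        for i in range(start, start + k):
            J[i][i] = Fraction(mu)
            if i + 1 < start + k:
                J[i][i + 1] = Fraction(1)  # superdiagonal 1 inside the block
        start += k
    return J


def rank(M):
    """Rank of a matrix by Gaussian elimination over the rationals."""
    M = [row[:] for row in M]
    r = 0
    for c in range(len(M[0])):
        pivot = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, len(M)):
            f = M[i][c] / M[r][c]
            M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r


def multiplicities(blocks, mu):
    """Return (algebraic, geometric) multiplicity of mu for the Jordan matrix."""
    J = jordan_matrix(blocks)
    n = len(J)
    algebraic = sum(1 for i in range(n) if J[i][i] == mu)
    shifted = [[(mu if i == j else 0) - J[i][j] for j in range(n)]
               for i in range(n)]
    geometric = n - rank(shifted)  # dimension of the eigenspace
    return algebraic, geometric


blocks = [(1, 5), (3, 5), (2, 8), (4, 8)]
print(multiplicities(blocks, 5))  # (4, 2)
print(multiplicities(blocks, 8))  # (6, 2)
```

The geometric multiplicity returned equals the number of blocks carrying the given eigenvalue, in agreement with the count made in the example.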


Our goal is to show that every matrix is similar to a Jordan 
matrix. By Theorem 7.3.7, a matrix A of order n is similar to an 
upper triangular matrix T where the eigenvalues of A occur on the 
main diagonal of T in any specified order. We now specify that 
equal eigenvalues occur consecutively on the main diagonal of T. 
Suppose that A has at least two different eigenvalues and let $\mu$ be
the one that occurs in the initial positions of the main diagonal of
T. Thus T has the form

\[
T = \begin{bmatrix} T_1 & X \\ O & U \end{bmatrix}, \qquad (7.8)
\]

where $T_1$ is an upper triangular matrix each of whose diagonal
entries equals $\mu$ and U is an upper triangular matrix none of whose
diagonal entries equals $\mu$. We now show that we may take X = O,
that is, T is similar to the matrix

\[
\begin{bmatrix} T_1 & O \\ O & U \end{bmatrix},
\]

by using elementary combination similarities. Consider rows i and
j of T where row i intersects $T_1$ and row j intersects U, and the
elementary similarity $T' = I(c \cdot j + i)\, T\, I(-c \cdot j + i)$. Let a be the
(i, j) entry of T and let the (j, j) entry of T be $\theta$. Then
$\theta \neq \mu$, and the (i, j) entry of $T'$ equals $a + c(\theta - \mu)$.
Thus, by choosing $c = -a/(\theta - \mu)$, the (i, j) entry of $T'$ equals 0.
Moreover, since $T_1$ and U are upper triangular, $T'$ differs from T only
in those positions of row i in columns j, j + 1, ..., n and those
positions of column j in rows 1, 2, ..., i. It now follows that by a
sequence of elementary combination similarities, we may make each entry
of X equal to 0 with no change in $T_1$ and U. We do this row by
row starting from the last row of $T_1$ and working up to the first
row, and making 0 each entry of the current row beginning with
its first entry and working to the right to its last entry.
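
The clearing procedure just described can be sketched in Python. This is a minimal illustration under the assumption that the elementary similarity adds c times row j to row i and then subtracts c times column i from column j; the function names `combination_similarity` and `clear_block`, and the sample matrix, are hypothetical and chosen only for the demonstration.

```python
from fractions import Fraction


def combination_similarity(T, i, j, c):
    """Return E T E^{-1}, where E adds c times row j to row i (0-based).

    Left multiplication by E adds c * (row j) to row i; right
    multiplication by E^{-1} subtracts c * (column i) from column j.
    """
    n = len(T)
    S = [row[:] for row in T]
    for col in range(n):
        S[i][col] += c * S[j][col]
    for row in range(n):
        S[row][j] -= c * S[row][i]
    return S


def clear_block(T, m):
    """Zero the block X (rows < m, columns >= m) of an upper triangular T.

    Assumes the first m diagonal entries all differ from the remaining ones.
    """
    n = len(T)
    for i in range(m - 1, -1, -1):      # last row of T_1 first, then upward
        for j in range(m, n):           # left to right within the row
            a, theta, mu = T[i][j], T[j][j], T[i][i]
            if a != 0:
                T = combination_similarity(T, i, j, -a / (theta - mu))
    return T


T = [[Fraction(v) for v in row] for row in
     [[2, 1, 3, 4],
      [0, 2, 5, 6],
      [0, 0, 7, 1],
      [0, 0, 0, 9]]]
result = clear_block(T, 2)
for row in result:
    print([int(v) for v in row])
```

In this sample the diagonal blocks (with diagonals 2, 2 and 7, 9) are left unchanged while the upper right 2-by-2 block becomes zero.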

If the matrix U in (7.8) does not have a constant main diagonal,
we repeat the above argument on U. Eventually, we obtain that
our original matrix A is similar to a matrix of the form

\[
T_1 \oplus T_2 \oplus \cdots \oplus T_s,
\]

where each $T_i$ is an upper triangular matrix with a constant main
diagonal, and no two of these constants are equal. The reduction
of T by similarity to a Jordan matrix is complete once each of the
$T_i$ has been reduced by similarity to a Jordan matrix.

Thus we may now assume that T is an upper triangular matrix
of order m, each of whose main diagonal entries equals $\mu$. The
proof is by induction on m. If m = 1, then $T = [\mu]$ is a Jordan
block of order 1. Now let m = 2. Then

\[
T = \begin{bmatrix} \mu & a \\ 0 & \mu \end{bmatrix}
\]

for some scalar a. If a = 0, then T is a direct sum of two Jordan
blocks of order 1 and so is a Jordan matrix. Suppose that $a \neq 0$.
By an elementary diagonal similarity (multiply row 1 by 1/a and
column 1 by a) we obtain the Jordan matrix

\[
\begin{bmatrix} \mu & 1 \\ 0 & \mu \end{bmatrix}.
\]

Now assume that m > 2. The leading principal submatrix $T'$ of T
of order m − 1 is upper triangular with all $\mu$'s on its main diagonal.
By the inductive hypothesis, there is an invertible matrix Q of
order m − 1 such that the matrix $S' = Q^{-1} T' Q$ is a Jordan matrix.
Let $P = Q \oplus I_1$, an invertible matrix of order m with inverse
$Q^{-1} \oplus I_1$. Then

\[
S = P^{-1} T P = \begin{bmatrix} S' & * \\ O & \mu \end{bmatrix}. \qquad (7.9)
\]

If the last column of S is all zeros apart from $\mu$ at its end, then
since $S'$ is a Jordan matrix, S is also a Jordan matrix, and we are
done. We now assume that there is a nonzero entry in the last
column of S above its last entry.

First suppose that there is an entry $h \neq 0$ in the last column
that is in the same row as an off-diagonal 1 in one of the Jordan
blocks of $S'$. In this case, there is an elementary combination
similarity that replaces h with 0 and otherwise does not change S.
For example, if

\[
S = \begin{bmatrix}
\mu & 1 & 0 & 0 & 0 & 0 & * \\
0 & \mu & 0 & 0 & 0 & 0 & * \\
0 & 0 & \mu & 1 & 0 & 0 & * \\
0 & 0 & 0 & \mu & 1 & 0 & h \\
0 & 0 & 0 & 0 & \mu & 1 & * \\
0 & 0 & 0 & 0 & 0 & \mu & * \\
0 & 0 & 0 & 0 & 0 & 0 & \mu
\end{bmatrix}, \qquad (7.10)
\]

the h is in the same row as the 1 in column 5. The elementary
combination similarity that adds −h times column 5 to column 7
and h times row 7 to row 5 replaces h with 0 and otherwise does
not change S. In this way we can reduce the matrix S in (7.9)
by elementary combination similarities so that the only nonzero
entries in its last column above the $\mu$ in its last position occur in
the same row as the last row of one of the Jordan blocks of $S'$. We
now assume that S has this form. For instance, in the case S as
given in (7.10), we get

\[
S = \begin{bmatrix}
\mu & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & \mu & 0 & 0 & 0 & 0 & p \\
0 & 0 & \mu & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & \mu & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & \mu & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & \mu & q \\
0 & 0 & 0 & 0 & 0 & 0 & \mu
\end{bmatrix}, \qquad (7.11)
\]

where p and q may be nonzero.

The digraph D(S) has a very simple structure. It consists of a
number of pairwise vertex-disjoint paths (these correspond to the
Jordan blocks that have only 0's across from them in column m)
and, entirely disjoint from them, a number of other paths that,
except for the fact that they all terminate at the vertex m
(corresponding to column m), are also pairwise vertex-disjoint (these
correspond to the Jordan blocks that have one nonzero entry across
from their last row in column m). We now show that by elementary
combination similarities we can replace all but one of the
nonzero off-diagonal entries in column m of S with 0, again
without changing any other entry of S. The nonzero off-diagonal entry
that remains is one corresponding to the largest Jordan block of
$S'$ (if there is more than one such largest Jordan block, we can
choose one arbitrarily). We refer to the particular S in (7.11), but
our procedure works in general. The digraph D(S) in this case
consists of a path of length 2 and a path of length 4 that meet
in the vertex corresponding to column 7. Assume that $p \neq 0$ and
$q \neq 0$. The scalar q is opposite the largest Jordan block. Using an
elementary diagonal similarity, we may assume that q = 1. With
this in mind, we now perform a sequence of elementary similarities
that replaces p with 0 and otherwise makes no change:


(i) Add −p times row 6 to row 2 and p times column 2 to column
6.


(ii) Add −p times row 5 to row 1 and p times column 1 to column
5.

Notice how this uses the fact that the second Jordan block (of
order 4) has more rows than the first Jordan block (of order 2). In
this way we reduce S to a Jordan matrix. Hence, by induction,
we have proved the following important theorem.


Theorem 7.3.11 Every matrix of order n is similar to a Jordan 
matrix. 
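
The two elementary similarities used at the end of the proof can be verified numerically. The sketch below assumes the matrix S of (7.11) consists of Jordan blocks of orders 2 and 4 followed by a last row and column, with entry p opposite the last row of the first block and q = 1 opposite the last row of the second; the values of `mu` and `p` and the function name `similarity` are arbitrary illustrative choices. Each step pairs a row operation with the column operation that makes it a similarity.

```python
def similarity(S, i, j, c):
    """Add c*(row j) to row i, then subtract c*(column i) from column j.

    Indices are 1-based to match the numbering used in the proof.
    """
    i, j = i - 1, j - 1
    S = [row[:] for row in S]
    for col in range(len(S)):
        S[i][col] += c * S[j][col]
    for row in range(len(S)):
        S[row][j] -= c * S[row][i]
    return S


mu, p = 3, 5  # sample values; q has already been scaled to 1
S = [[mu, 1, 0, 0, 0, 0, 0],
     [0, mu, 0, 0, 0, 0, p],
     [0, 0, mu, 1, 0, 0, 0],
     [0, 0, 0, mu, 1, 0, 0],
     [0, 0, 0, 0, mu, 1, 0],
     [0, 0, 0, 0, 0, mu, 1],
     [0, 0, 0, 0, 0, 0, mu]]

# Step (i): add -p times row 6 to row 2 and p times column 2 to column 6.
S1 = similarity(S, 2, 6, -p)
# Step (ii): add -p times row 5 to row 1 and p times column 1 to column 5.
S2 = similarity(S1, 1, 5, -p)

for row in S2:
    print(row)
```

After step (i) the entry p has moved to position (1, 6); step (ii) removes it, leaving the Jordan matrix with one block of order 2 and one block of order 5.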


If J is a Jordan matrix similar to the matrix A, then J is called
a Jordan Canonical Form of A, abbreviated to JCF. A JCF of A 
is unique apart from the obvious fact that the Jordan blocks may 
occur in any order. For example, 


are both JCFs of the same matrix A. Indeed, such an A would 
have six JCFs, as there are 3! = 6 ways in which to order the three 
different Jordan blocks. On the other hand, the matrix 


\[ J_3(5) \oplus J_3(5) \]


is the unique JCF of a matrix B as there is only one way to list its 
two (identical) Jordan blocks. We do not prove here the general 
uniqueness property of the JCF. 

We have seen how a large part of the proof for a JCF—starting 
from Jacobi’s theorem that a matrix is similar to a triangular 
matrix—can be made graph-theoretical (see [6] and the reference 
to Turnbull and Aitken there). A similar proof also appears in 
[23]. 


7.4 Spectrum of Circulants 


We recall the definition of a circulant matrix from Section 3.2. Let
P be the permutation matrix of order n defined by

\[
P = \begin{bmatrix}
0 & 1 & 0 & \cdots & 0 & 0 \\
0 & 0 & 1 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & 0 & 1 \\
1 & 0 & 0 & \cdots & 0 & 0
\end{bmatrix}.
\]

Let

\[
g(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_{n-1} x^{n-1}
\]

be a polynomial of degree at most n − 1. Then

\[
C = g(P) = a_0 I_n + a_1 P + a_2 P^2 + \cdots + a_{n-1} P^{n-1}
\]

is a circulant of order n.

The digraph D(P) is a cycle of length n and thus P satisfies
$P^n = I_n$. The characteristic polynomial of P is $\det(\lambda I_n - P)$. The
digraph $D^*(\lambda I_n - P)$ is a cycle of length n with a loop at each
of its vertices and hence has only two linear subdigraphs, namely,
the cycle itself and the linear subdigraph consisting of the n loops.
It follows from the definition of determinant that

\[
\det(\lambda I_n - P) = \lambda^n + (-1)^{n-1} (-1)^n = \lambda^n - 1.
\]

Hence the eigenvalues of P are the n nth roots of unity
$1, \omega, \omega^2, \ldots, \omega^{n-1}$, where $\omega = e^{2\pi i/n}$ and i is the
complex number equal to the square root of −1. Because the eigenvalues
of P are distinct, the Jordan Canonical Form of P has only Jordan
blocks of order 1. Hence the Jordan Canonical Form of P is the
diagonal matrix

\[
D = \begin{bmatrix}
1 & 0 & 0 & \cdots & 0 \\
0 & \omega & 0 & \cdots & 0 \\
0 & 0 & \omega^2 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & \omega^{n-1}
\end{bmatrix}.
\]

We now invoke Corollary 7.3.8 and conclude that the n eigenvalues
of the circulant C are

\[
g(\omega^k) \quad (k = 0, 1, \ldots, n - 1).
\]


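Let

\[
X = \begin{bmatrix}
1 & 1 & 1 & \cdots & 1 \\
1 & \omega & \omega^2 & \cdots & \omega^{n-1} \\
1 & \omega^2 & \omega^4 & \cdots & \omega^{2(n-1)} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
1 & \omega^{n-1} & \omega^{2(n-1)} & \cdots & \omega^{(n-1)^2}
\end{bmatrix}.
\]

Then the columns of X are eigenvectors for the n eigenvalues
$1, \omega, \omega^2, \ldots, \omega^{n-1}$ of P and hence

\[
PX = XD \quad \text{or, equivalently,} \quad X^{-1} P X = D.
\]

Since C = g(P) we also have that

\[
X^{-1} C X = g(D).
\]

Hence the columns of X are also eigenvectors for the eigenvalues

\[
g(1), \ g(\omega), \ g(\omega^2), \ \ldots, \ g(\omega^{n-1})
\]

of C.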
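
The statement that the columns of X are eigenvectors of every circulant can be tested numerically. The following Python sketch assumes the convention above, in which P has 1's in positions (i, i + 1) and (n, 1), so that the (i, j) entry of C = g(P) is the coefficient with index (j − i) mod n. The names `circulant` and `circulant_eigenvalues` and the sample coefficients are illustrative only.

```python
import cmath


def circulant(coeffs):
    """Circulant C = g(P) for g(x) = a_0 + a_1 x + ... + a_{n-1} x^{n-1}."""
    n = len(coeffs)
    return [[coeffs[(j - i) % n] for j in range(n)] for i in range(n)]


def circulant_eigenvalues(coeffs, tol=1e-9):
    """Return g(omega^k), k = 0..n-1, checking C x_k = g(omega^k) x_k."""
    n = len(coeffs)
    C = circulant(coeffs)
    omega = cmath.exp(2j * cmath.pi / n)
    eigenvalues = []
    for k in range(n):
        lam = sum(a * omega ** (k * m) for m, a in enumerate(coeffs))
        x = [omega ** (k * i) for i in range(n)]      # column k of X
        Cx = [sum(C[i][j] * x[j] for j in range(n)) for i in range(n)]
        assert all(abs(Cx[i] - lam * x[i]) < tol for i in range(n))
        eigenvalues.append(lam)
    return eigenvalues


# g(x) = 1 + 2x + 3x^3 with n = 4, so omega = i.
values = circulant_eigenvalues([1, 2, 0, 3])
print([complex(round(v.real, 6), round(v.imag, 6)) for v in values])
```

For these coefficients the eigenvalues are g(1) = 6, g(i) = 1 − i, g(−1) = −4, and g(−i) = 1 + i.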


7.5 Exercises 


1. Let x and y be eigenvectors for the eigenvalue $\lambda$ of a square
matrix A, and let $\alpha$ and $\beta$ be real numbers. Prove that
$\alpha x + \beta y$ is also an eigenvector for the eigenvalue $\lambda$ of A.


2. Show that the eigenvalues of the matrix $aI_n + bJ_n$ are a (n − 1
times) and a + nb (once). Here $J_n$ is the square matrix of
order n, each of whose entries is 1.


3. A square matrix A is idempotent provided $A^2 = A$. For example,
the matrix

\[
\begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix}
\]


is idempotent. Prove that 0 and 1 are the only possible 
eigenvalues of an idempotent matrix. (Note that the zero 
matrix is idempotent with each of its eigenvalues equal to 
0, and the identity matrix is idempotent with each of its 
eigenvalues equal to 1.) 


4. Prove that the trace and rank of an idempotent matrix are 
equal. 


5. Let A and B be matrices of order n. Prove that $\lambda$ is an
eigenvalue of AB if and only if $\lambda$ is an eigenvalue of BA.


6. Let A and B be matrices of order n, at least one of which is
invertible. Show that AB and BA are similar.


7. Let A be a nonsingular matrix of order n. Determine the
characteristic polynomial of $A^{-1}$ in terms of the characteristic
polynomial of A.


8. Let A be an invertible matrix of order n. Use the Cayley-Hamilton
theorem to obtain a polynomial f(x) such that $A^{-1} = f(A)$.


9. Determine the characteristic polynomial of a general matrix
of order 4 by means of the Coates digraph.


10. Let u and v be vectors in $R^n$. Let A be the matrix $uv^T$ of
order n (the (i, j) entry of this matrix is $u_i v_j$ ($1 \le i, j \le n$)).
Find the eigenvalues and eigenvectors of A.


11. Calculate the n eigenvalues of the matrix

\[
\begin{bmatrix}
0 & 0 & \cdots & 0 & a_1 \\
0 & 0 & \cdots & 0 & a_2 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & \cdots & 0 & a_{n-1} \\
a_1 & a_2 & \cdots & a_{n-1} & a_n
\end{bmatrix}.
\]


12. Determine all possible Jordan Canonical Forms for a matrix
of order 6, all of whose eigenvalues equal 3. Classify these
Jordan Canonical Forms according to the geometric multiplicity
of 3.


13. Find the Jordan Canonical Form of the matrix

\[
\begin{bmatrix}
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 \\
1 & -4 & 6 & 4
\end{bmatrix}.
\]


14. Determine the Jordan Canonical Form of the matrix


2000 
o 
0 
1 


Coo e.e 
O N e 
=. o.e 


15. Let A be a matrix of order m and let B be a matrix of
order n. Let $p_A(\lambda)$ be the characteristic polynomial of A.
Prove that $p_A(B)$ is invertible if and only if A and B have
no common eigenvalues.


16. Determine the eigenvalues of the matrix

\[
\begin{bmatrix}
1 & 0 & -1 & 0 \\
0 & 1 & 0 & -1 \\
-1 & 0 & 1 & 0 \\
0 & -1 & 0 & 1
\end{bmatrix}.
\]


17. For the Jordan block $J_k(0)$, show that $J_k(0)^k = O$ but
$J_k(0)^{k-1} \neq O$. Let $p_k(x) = (x - a)^k$. Deduce that the Jordan
block $J_k(a)$ satisfies $p_k(J_k(a)) = O$ but $p_{k-1}(J_k(a)) \neq O$.


18. Let A be a matrix of order n. Let $\lambda_1, \lambda_2, \ldots, \lambda_t$ be the
distinct eigenvalues of A, and in the Jordan Canonical Form of
A, let the orders of the largest Jordan blocks corresponding to the
eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_t$ be, respectively,
$m_1, m_2, \ldots, m_t$. Let

\[
q(x) = (x - \lambda_1)^{m_1} (x - \lambda_2)^{m_2} \cdots (x - \lambda_t)^{m_t}.
\]

Prove that q(A) = O and that if any of the exponents
$m_1, m_2, \ldots, m_t$ in q(x) is decreased, resulting in the
polynomial p(x), then $p(A) \neq O$.


Chapter 8 


Nonnegative Matrices 


In this chapter we consider matrices each of whose entries is a 
nonnegative number. These matrices have special spectral prop- 
erties that depend solely on the digraph of the matrix and are 
independent of the magnitude of the positive entries. Some im- 
portant classes of nonnegative matrices, such as irreducible (Sec- 
tion 8.1), primitive, and imprimitive matrices (Section 8.2), are 
defined here, contrary to the standard approach, by properties of 
associated digraphs (strong connectedness, lengths of cycles, etc.).
We discuss, mostly without proof, many of the results of the so- 
called Perron-Frobenius theory of nonnegative matrices (Section 
8.3). This theory represents a basic ingredient of the theory of 
graph spectra where tools from matrix theory are used to study 
graphs (a direction quite opposite from our main stream here since 
we want to show how graphs are used to treat matrices). Section 
8.4 represents a short introduction to graph spectra. 


8.1 Irreducible and Reducible 
Matrices 


A matrix is called nonnegative, respectively, positive, provided all
of its entries are nonnegative, respectively, positive. In the theory
of nonnegative matrices the notion of irreducibility plays an
important role, and this is equivalent to the notion of strong
connectivity for digraphs discussed in Chapter 1.


Definition 8.1.1 A square matrix A of order n is irreducible
provided that its digraph D(A) is strongly connected; otherwise, A
is reducible. Recall from Theorem 1.2.3 that a digraph is strongly
connected if and only if there does not exist a partition of its vertex
set into two nonempty sets U and W such that each edge between
U and W has its initial vertex in U and its terminal vertex in W.
Thus if we simultaneously permute the rows and columns of A so
that the first rows and columns correspond to U, we obtain that
A is reducible if and only if there is a permutation matrix P such
that

\[
PAP^T = \begin{bmatrix} X & Y \\ O & Z \end{bmatrix}, \qquad (8.1)
\]

where X and Z are square matrices of order at least 1. The matrix
A is irreducible provided the form (8.1) cannot be achieved for any
permutation matrix P. Note that the zero matrix in (8.1) is of
size p by q where p and q are positive integers with p + q = n.
It follows from the definition that a matrix of order 1 is always
irreducible. We also note that if we had listed the vertices of W
first, then we would get a permutation matrix Q such that

\[
QAQ^T = \begin{bmatrix} Z & O \\ Y & X \end{bmatrix}
\]

with the zero matrix occurring in the upper right corner.


Recall from Theorem 1.2.3 that a digraph G has $l \ge 1$ strong
components (strongly connected, induced subdigraphs whose sets
of vertices partition the set of vertices of G) and that these strong
components can be ordered as $G_1, G_2, \ldots, G_l$ so that the only edges
between the components are edges whose initial vertex is a vertex
in $G_i$ and whose terminal vertex is a vertex in $G_j$, where i < j,
that is, in the ordering $G_1, G_2, \ldots, G_l$, all edges between
components go from left to right. We have l = 1 if and only if G is
strongly connected. Applying this fact to the digraph D(A) of a
square matrix A, we see that the rows and columns of A can be
simultaneously permuted to achieve a block triangular form called
the Frobenius normal form:


There exists a permutation matrix Q such that

\[
QAQ^T = \begin{bmatrix}
A_1 & A_{12} & A_{13} & \cdots & A_{1l} \\
O & A_2 & A_{23} & \cdots & A_{2l} \\
O & O & A_3 & \cdots & A_{3l} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
O & O & O & \cdots & A_l
\end{bmatrix}, \qquad (8.2)
\]

where $A_1, A_2, \ldots, A_l$ are irreducible square matrices. Because the
matrices $A_1, A_2, \ldots, A_l$ correspond to the strong components of
D(A), and since strong components are uniquely determined (they
are the equivalence classes of an equivalence relation on the
vertices of D(A)), the matrices $A_1, A_2, \ldots, A_l$ are uniquely
determined up to simultaneous permutations of their rows and columns,
that is, up to the order in which the vertices of the strong
components are written down. The matrices $A_1, A_2, \ldots, A_l$ are the
irreducible components of A. The matrix A is irreducible if and
only if it has exactly one irreducible component. The order in
which the irreducible components occur on the diagonal in (8.2)
is not necessarily unique; it all depends on whether or not the
matrices $A_{ij}$ are zero matrices.


Example 8.1.2 The following matrix is in Frobenius normal 
form: 


There are four irreducible components, and the first irreducible
component could be in any one of the four places; the relative
order of the other three irreducible components is fixed.

An important algebraic characterization of irreducible nonnegative
matrices is contained in the following theorem.


Theorem 8.1.3 Let A be a nonnegative matrix of order n. Then
A is irreducible if and only if $(I_n + A)^{n-1}$ is a positive matrix,
equivalently, $I_n + A + A^2 + \cdots + A^{n-1}$ is a positive matrix.


Proof. The matrix $I_n + A$ has positive diagonal entries and
hence the digraph $D(I_n + A)$ has a loop at each vertex. First
suppose that A is irreducible. Then $D(I_n + A)$ is strongly connected
and for each ordered pair u, v of distinct vertices there is a
(shortest) path from u to v of length at most n − 1. Because there is a
loop at each vertex, there is a walk of length exactly n − 1 from
u to v. Because $I_n + A$ is a nonnegative matrix, all walks have
positive weights. It follows that $(I_n + A)^{n-1}$ is a positive matrix.
Since

\[
(I_n + A)^{n-1} = \sum_{k=0}^{n-1} \binom{n-1}{k} A^k,
\]

it also follows that $I_n + A + A^2 + \cdots + A^{n-1}$ is a positive matrix.

Conversely, if $(I_n + A)^{n-1}$ is a positive matrix, then for each
ordered pair of distinct vertices u, v there is a walk of positive
weight in $D(I_n + A)$ from u to v of length n − 1. Deleting the loops
from this walk gives a walk in D(A) from u to v. Hence D(A) is strongly
connected and A is irreducible.
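
Theorem 8.1.3 gives a direct computational test for irreducibility. The Python sketch below works only with the zero-nonzero pattern of $I_n + A$, since positivity of the power depends only on that pattern; the function name `is_irreducible` and the sample matrices are illustrative.

```python
def is_irreducible(A):
    """Test whether the nonnegative square matrix A is irreducible.

    Uses the criterion that (I + A)^(n-1) is a positive matrix, computed
    on Boolean patterns so that the size of the entries plays no role.
    """
    n = len(A)
    B = [[A[i][j] > 0 or i == j for j in range(n)] for i in range(n)]
    power = [[i == j for j in range(n)] for i in range(n)]  # (I + A)^0
    for _ in range(n - 1):
        power = [[any(power[i][k] and B[k][j] for k in range(n))
                  for j in range(n)] for i in range(n)]
    return all(all(row) for row in power)


cycle = [[0, 1, 0],
         [0, 0, 1],
         [1, 0, 0]]          # digraph is a directed 3-cycle
triangular = [[1, 1],
              [0, 1]]        # no edge from vertex 2 back to vertex 1

print(is_irreducible(cycle))       # True
print(is_irreducible(triangular))  # False
```

A matrix of order 1 passes the test trivially, in agreement with the remark following Definition 8.1.1.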


The following corollary is an easy consequence of Theorem 
8.1.3. 


Corollary 8.1.4 Let A be an irreducible nonnegative matrix of
order n each of whose diagonal entries is positive. Then $A^{n-1}$ is
a positive matrix.


8.2 Primitive and Imprimitive 
Matrices 

The cycles of the (strongly connected) digraph D(A) of an
irreducible nonnegative matrix A have a strong influence on the
spectrum of A.

Definition 8.2.1 Let G be a strongly connected digraph. The
greatest common divisor (abbreviated GCD) d of the lengths of
the cycles of G is called the index of imprimitivity of G. If d = 1,
then G is primitive; if d > 1, then G is imprimitive. Note that
since a closed walk is composed of cycles, in defining d we could
use the lengths of all the closed walks in G (see also Lemma 8.2.2
below). Let A be an irreducible nonnegative matrix of order n.
Then the index of imprimitivity of A is defined to be the index
of imprimitivity of the digraph D(A), and A is primitive or
imprimitive according to whether D(A) is primitive or imprimitive.


If the index of imprimitivity d is greater than 1, a certain 
structure is imposed on a digraph and a matrix. First we make 
the following observation. 


Lemma 8.2.2 Let G be a strongly connected digraph with vertices
1, 2, ..., n and with index of imprimitivity d. For each vertex i of
G, let $d_i$ be the GCD of the lengths of all closed walks containing
i. Then $d = d_1 = d_2 = \cdots = d_n$. Moreover, the lengths of any two
walks with the same initial vertex and the same terminal vertex
are congruent modulo d.


Proof. Consider vertices i and j with $i \neq j$. Because G is
strongly connected, there exists a path $\gamma$ from i to j and a path
$\gamma'$ from j to i. The path $\gamma$ followed by $\gamma'$ gives a closed walk
$\theta$ containing both i and j. Let $\theta$ have length s. Then $d_i$ and
$d_j$ are both divisors of s. Thus, for each integer l for which there
exists a closed walk of length l containing vertex i, there exists a
closed walk of length s + l containing vertex j. Because $d_j$ is a
divisor of s and of s + l, $d_j$ is a divisor of l. Because this is true
for all such l, we conclude that $d_j$ is a divisor of $d_i$. In a similar
way one shows that $d_i$ is a divisor of $d_j$. Thus $d_i = d_j$ and we
conclude that $d_1 = d_2 = \cdots = d_n$. The common value must be d.

Now let $\gamma_1$ and $\gamma_2$ be two walks with the same initial vertex i
and the same terminal vertex j of lengths $k_1$ and $k_2$, respectively.
There exists a walk $\gamma_3$ from vertex j to vertex i of some length t,
giving two closed walks of lengths $k_1 + t$ and $k_2 + t$. Because d
divides both $k_1 + t$ and $k_2 + t$, d divides
$(k_1 + t) - (k_2 + t) = k_1 - k_2$, that is, $k_1$ and $k_2$ are congruent
modulo d.


Theorem 8.2.3 Let G be a strongly connected digraph with vertex
set V having an index of imprimitivity equal to d > 1. Then V can
be partitioned into d nonempty sets $V_0, V_1, \ldots, V_{d-1}$ such that each
edge of G has its initial vertex in some $V_i$ and its terminal vertex
in $V_{i+1}$ (subscripts considered modulo d). Thus the subdigraphs
induced on each of $V_0, V_1, \ldots, V_{d-1}$ do not contain any edges, and
the edges of G are arranged in a circular pattern $V_0$ to $V_1$, $V_1$ to
$V_2$, ..., $V_{d-2}$ to $V_{d-1}$, and $V_{d-1}$ to $V_0$.


Proof. Consider any vertex a of G. For i = 0, 1, ..., d − 1,
let $V_i$ be the set of vertices x to which there is some walk from a
of length congruent to i modulo d (and so by Lemma 8.2.2, every
walk from a to x has length congruent to i modulo d). Note that
a belongs to $V_0$, and the sets $V_0, V_1, \ldots, V_{d-1}$ are pairwise disjoint
and nonempty (because there is a cycle containing a and it has
length at least d). Suppose that there is an arc from vertex u to
vertex v, where u is in $V_i$ and v is in $V_j$. There is a walk from a to
u of length congruent to i modulo d and hence a walk from a to v
of length congruent to i + 1 modulo d. From the definition of the
$V_i$'s we now conclude that i + 1 is congruent to j modulo d, that
is, modulo d, j equals i + 1.
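
The proof suggests a way to compute the index of imprimitivity and the sets $V_0, V_1, \ldots, V_{d-1}$. The Python sketch below uses breadth-first distances from a chosen vertex a: by Lemma 8.2.2, every edge from u to v satisfies dist(u) + 1 ≡ dist(v) modulo d, and the GCD of the differences dist(u) + 1 − dist(v) over all edges equals d, because the differences along any cycle sum to its length. This distance-based method is a standard technique and is not taken from the text; the function name `imprimitivity` and the adjacency-list input format are assumptions, and the digraph is assumed to be strongly connected with at least one cycle.

```python
from collections import deque
from math import gcd


def imprimitivity(adj, root=0):
    """Return (d, [V_0, ..., V_{d-1}]) for a strongly connected digraph.

    adj maps each vertex to the list of terminal vertices of its edges.
    """
    dist = {root: 0}
    queue = deque([root])
    while queue:                      # breadth-first search from the root
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    d = 0
    for u in adj:
        for v in adj[u]:
            d = gcd(d, dist[u] + 1 - dist[v])
    classes = [sorted(v for v in adj if dist[v] % d == i) for i in range(d)]
    return d, classes


four_cycle = {0: [1], 1: [2], 2: [3], 3: [0]}
k33 = {0: [3, 4, 5], 1: [3, 4, 5], 2: [3, 4, 5],
       3: [0, 1, 2], 4: [0, 1, 2], 5: [0, 1, 2]}
primitive = {0: [1, 2], 1: [2], 2: [0]}   # cycles of lengths 3 and 2

print(imprimitivity(four_cycle))  # (4, [[0], [1], [2], [3]])
print(imprimitivity(k33))         # (2, [[0, 1, 2], [3, 4, 5]])
print(imprimitivity(primitive))   # (1, [[0, 1, 2]])
```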


The pairwise disjoint sets $V_0, V_1, \ldots, V_{d-1}$ in Theorem 8.2.3 are
called the sets of imprimitivity of G. In case d = 1, the vertex set
V is the unique set of imprimitivity of G. A digraph G with the
structure as given in Theorem 8.2.3 is called cyclically d-partite.
Note that if m is a divisor of d, then G is also cyclically m-partite.

If G is the digraph of an irreducible nonnegative matrix and
we order the vertices of G as given in Theorem 8.2.3, so that the
vertices in $V_0$ come first, followed by those in $V_1$, ..., followed by
those in $V_{d-1}$, we obtain the following matrix interpretation of
Theorem 8.2.3.

Theorem 8.2.4 Let A be an irreducible nonnegative matrix of
order n with index of imprimitivity equal to d. Then there exists
a permutation matrix P such that

\[
PAP^T = \begin{bmatrix}
O & A_{01} & O & \cdots & O \\
O & O & A_{12} & \cdots & O \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
O & O & O & \cdots & A_{d-2,d-1} \\
A_{d-1,0} & O & O & \cdots & O
\end{bmatrix}, \qquad (8.3)
\]

where the square zero matrices on the main diagonal have orders
$k_0, k_1, \ldots, k_{d-1}$, respectively (these are the sizes of
$V_0, V_1, \ldots, V_{d-1}$ in Theorem 8.2.3).


In general, the matrices $A_{i,i+1}$ in (8.3) are rectangular. Using
block multiplication, we immediately obtain the following corollary.


Corollary 8.2.5 Let A be an irreducible nonnegative matrix of
order n with index of imprimitivity equal to d. Then there exists
a permutation matrix P such that

\[
PA^dP^T = B_0 \oplus B_1 \oplus \cdots \oplus B_{d-1},
\]

a block-diagonal matrix whose blocks $B_0, B_1, \ldots, B_{d-1}$ are

\[
B_0 = A_{01} A_{12} \cdots A_{d-1,0}, \quad
B_1 = A_{12} A_{23} \cdots A_{d-1,0} A_{01}, \quad \ldots, \quad
B_{d-1} = A_{d-1,0} A_{01} \cdots A_{d-2,d-1}.
\]


To conclude this section, we show that some positive integral
power of a primitive (nonnegative) matrix is a positive matrix. In
order to do this, we make use of a number-theoretic lemma that
we state without proof.

Lemma 8.2.6 Let $d_1, d_2, \ldots, d_k$ be positive integers whose GCD
equals 1. Then every sufficiently large positive integer can be
expressed as a nonnegative linear combination of $d_1, d_2, \ldots, d_k$. That
is, there exists a positive integer M such that, for each integer
$m \ge M$, there are nonnegative integers $a_1, a_2, \ldots, a_k$ such that

\[
m = a_1 d_1 + a_2 d_2 + \cdots + a_k d_k.
\]


Theorem 8.2.7 Let A be a primitive matrix of order n. Then
there exists a positive integer p such that $A^p$ is a positive matrix.


Proof. The matrix $A^p$ is a positive matrix if and only if, in
the digraph D(A), for any ordered pair of not necessarily distinct
vertices u, v there is a walk of length p from u to v. Because A is
primitive, the digraph D(A) is strongly connected and the GCD
of the lengths of its cycles equals 1. Let the distinct cycle lengths
of D(A) be $d_1, d_2, \ldots, d_k$, where $1 \le d_1, d_2, \ldots, d_k \le n$. Because
D(A) is strongly connected, we can find a walk $\gamma_{uv}$ from u to v
that contains a vertex of a cycle of each length $d_1, d_2, \ldots, d_k$. Let
the length of such a walk be $l_{uv}$. We can extend $\gamma_{uv}$ by going
around the cycles it meets any number of times. Applying Lemma
8.2.6, we can obtain walks from u to v of any length greater than
or equal to $l_{uv} + M$. Now let p be the maximum of the numbers
$l_{uv} + M$ taken over all ordered pairs of vertices u, v. Then there
is a walk of length p from u to v for all u and v. Hence $A^p$ is a
positive matrix.


If A is a primitive matrix of order n, then the smallest positive
power of A that gives a positive matrix is called the exponent of
A. It is known that the exponent of A is at most $n^2 - 2n + 2$. This
exponent is achieved by the following matrix of order n with n + 1
positive entries:

\[
\begin{bmatrix}
0 & a_1 & b & 0 & \cdots & 0 \\
0 & 0 & a_2 & 0 & \cdots & 0 \\
0 & 0 & 0 & a_3 & \cdots & 0 \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & 0 & \cdots & a_{n-1} \\
a_n & 0 & 0 & 0 & \cdots & 0
\end{bmatrix},
\]

where $a_1, a_2, \ldots, a_n, b$ are all positive. Note that because clearly
a primitive matrix (or irreducible matrix of order at least 2) has
at least one positive entry per row and column, it follows that if
$A^p$ is a positive matrix, so are all powers $A^q$ with q > p. A
power of an imprimitive matrix cannot be positive. This follows,
for instance, from Lemma 8.2.2. Thus we can say that some power
of a nonnegative square matrix is a positive matrix if and only if
the matrix is primitive.
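
The bound $n^2 - 2n + 2$ can be observed for small n. The Python sketch below computes the exponent of a primitive matrix from its zero-nonzero pattern by repeated Boolean multiplication. The pattern built by `extremal_pattern` is an assumption about the matrix displayed above: a cycle through all n vertices together with one extra positive entry in position (1, 3), which creates a second cycle of length n − 1. Both function names are illustrative.

```python
def extremal_pattern(n):
    """Boolean pattern with n + 1 true entries (requires n >= 3).

    Assumed layout: superdiagonal entries, the entry (n, 1), and one
    extra entry in position (1, 3).
    """
    A = [[False] * n for _ in range(n)]
    for i in range(n - 1):
        A[i][i + 1] = True     # a_1, ..., a_{n-1}
    A[n - 1][0] = True         # a_n closes the cycle of length n
    A[0][2] = True             # b creates a cycle of length n - 1
    return A


def exponent(A):
    """Smallest p with A^p entrywise positive, or None if none up to n^2."""
    n = len(A)
    power = [row[:] for row in A]
    for p in range(1, n * n + 1):
        if all(all(row) for row in power):
            return p
        power = [[any(power[i][k] and A[k][j] for k in range(n))
                  for j in range(n)] for i in range(n)]
    return None


for n in range(3, 7):
    print(n, exponent(extremal_pattern(n)), n * n - 2 * n + 2)
```

For n = 3 the computed exponent is 5, which equals $3^2 - 2 \cdot 3 + 2$. An imprimitive pattern, such as that of a single directed cycle, makes `exponent` return None.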


8.3 The Perron—Frobenius Theorem 


The spectra of irreducible nonnegative matrices, in particular of 
positive matrices, have many special properties, which we present 
in this section without proof. First we make a general definition. 


Definition 8.3.1 Let A be a matrix of order n with eigenvalues
$\lambda_1, \lambda_2, \ldots, \lambda_n$. The spectral radius $\rho(A)$ of A is the
maximum of the absolute values of its eigenvalues:

\[
\rho(A) = \max\{|\lambda_1|, |\lambda_2|, \ldots, |\lambda_n|\}.
\]

The spectral radius of A is the radius of the smallest circle centered
at the origin that contains the spectrum of A. This circle is called
the spectral circle of A. The spectral radius of A is zero if A is a
nilpotent matrix and is positive otherwise.

Let A be an irreducible nonnegative matrix of order n. If
n > 1, then the digraph D(A) has a closed walk and hence A cannot
be nilpotent; hence A has positive spectral radius.


Example 8.3.2 Let

\[
A = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix},
\]

a nonnegative reducible matrix each of whose three irreducible
components is the zero matrix of order 1. Then $A^3 = O$, and the
eigenvalues of A are 0, 0, 0. Hence the spectral radius of A is 0.
Note that A is the Jordan block $J_3(0)$.

The following theorem gives an elementary bound for the spectral
radius of a matrix.


Theorem 8.3.3 Let $A = [a_{ij}]$ be a matrix of order n. Let

\[
r_i = \sum_{j=1}^{n} |a_{ij}| \quad (i = 1, 2, \ldots, n)
\]

be the sum of the absolute values of the entries in row i of A. Then

\[
\rho(A) \le \max\{r_1, r_2, \ldots, r_n\}.
\]

A similar inequality holds for the sum of the absolute values of the
entries in the columns of A.

Proof. Let $\lambda$ be any eigenvalue of A and let
$x = [x_1 \ x_2 \ \cdots \ x_n]^T$ be a corresponding eigenvector:
$Ax = \lambda x$. Let $|x_k| = \max\{|x_i| : 1 \le i \le n\} > 0$ be the largest
absolute value of an entry of x. Then, taking absolute values in the
equation

\[
\sum_{j=1}^{n} a_{kj} x_j = \lambda x_k
\]

and using the triangle inequality, we obtain

\[
|\lambda| |x_k| = |\lambda x_k| = \Bigl| \sum_{j=1}^{n} a_{kj} x_j \Bigr|
\le \sum_{j=1}^{n} |a_{kj}| |x_j|
\le \Bigl( \sum_{j=1}^{n} |a_{kj}| \Bigr) |x_k| = r_k |x_k|.
\]

Cancelling $|x_k|$, we obtain $|\lambda| \le r_k \le \max\{r_1, r_2, \ldots, r_n\}$.


In the next theorem we summarize the most important and
basic consequences of the Perron-Frobenius theory of nonnegative
matrices. In order to avoid the trivial situation of a zero matrix of
order 1 (which is an irreducible nonnegative matrix), we assume
that n > 1.

Theorem 8.3.4 Let A be an irreducible nonnegative matrix of
order n > 1. Then


(i) The spectral radius $\rho(A)$ of A is an eigenvalue of A, that is,
A has a positive eigenvalue r that is greater than or equal to
the absolute value of every eigenvalue of A. The number r,
which is the same as the spectral radius of A, is sometimes
called the Perron eigenvalue of A.


(ii) The algebraic multiplicity, and so the geometric multiplicity,
of the Perron eigenvalue r equals 1, that is, r is a simple
root of the characteristic polynomial of A.


(iii) Corresponding to the Perron eigenvalue r there is a positive
eigenvector y: Ay = ry, where y is a positive vector. The
vector y, and each of its positive multiples, is called a Perron
vector of A. The matrix A has no other nonnegative
eigenvectors (corresponding to any eigenvalue) other than positive
multiples of its Perron vector.


(iv) Let h be the index of imprimitivity of A. Then A has exactly
h eigenvalues whose absolute value equals r, that is, there
are exactly h eigenvalues on the spectral circle of A. The h
eigenvalues of A on the spectral circle are the roots of the
equation $\lambda^h - r^h = 0$, that is, the numbers

\[
r e^{2\pi i j/h} \quad (j = 0, 1, \ldots, h - 1).
\]

In fact, the entire spectrum of A is mapped into itself under
a rotation of the plane about the origin through an angle of
$2\pi/h$.


(v) If $A'$ is a principal submatrix of A, then $\rho(A') \le \rho(A)$ with
equality if and only if $A' = A$.


(vi) If B is a nonnegative matrix with $B \le A$ (entrywise), then
$\rho(B) \le \rho(A)$ with equality if and only if B = A.

Example 8.3.5 Let $P_n$ be the permutation matrix whose digraph
$D(P_n)$ has n edges arranged in the cycle that goes from 1 to 2, from
2 to 3, ..., from n − 1 to n, and from n to 1. For instance,

\[
P_4 = \begin{bmatrix}
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 \\
1 & 0 & 0 & 0
\end{bmatrix}.
\]

The matrix $P_n$ is an irreducible nonnegative matrix with spectral
radius equal to 1 and with index of imprimitivity equal to n. Its
n eigenvalues are

\[
e^{2\pi i j/n} \quad (j = 0, 1, \ldots, n - 1).
\]

When j = 0, we get the Perron eigenvalue 1. A Perron eigenvector
is the vector of all 1's or, more generally, a constant vector each
of whose entries is a positive number c.

Now let

\[
A = \begin{bmatrix}
0 & 0 & 0 & 1 & 1 & 1 \\
0 & 0 & 0 & 1 & 1 & 1 \\
0 & 0 & 0 & 1 & 1 & 1 \\
1 & 1 & 1 & 0 & 0 & 0 \\
1 & 1 & 1 & 0 & 0 & 0 \\
1 & 1 & 1 & 0 & 0 & 0
\end{bmatrix}.
\]

The matrix A is the adjacency matrix of the complete bipartite
graph $K_{3,3}$. Then A is irreducible and has an index of imprimitivity
equal to 2, and $A^2 = 3J_3 \oplus 3J_3$. The matrix $J_3$ has eigenvalues
3, 0, 0 and hence $A^2$ has eigenvalues 9, 9, 0, 0, 0, 0. The eigenvalues
of A are 3, −3, 0, 0, 0, 0 (because A has trace equal to zero, the
sum of the eigenvalues equals 0). Hence the Perron eigenvalue of
A equals 3 and a Perron vector is $[1 \ 1 \ 1 \ 1 \ 1 \ 1]^T$.


By Theorem 8.3.3, if A is an irreducible nonnegative matrix, 
then the maximum sum of the elements in a row of A is an upper 
bound for the spectral radius of A. Theorem 8.3.4 enables us to 
obtain a lower bound as well.

Theorem 8.3.6 Let $A = [a_{ij}]$ be an irreducible nonnegative matrix
of order n > 1. Let

$$r_i = \sum_{j=1}^{n} a_{ij} \quad (i = 1, 2, \ldots, n)$$

be the row sums of A. Then

$$\min\{r_i : 1 \le i \le n\} \le \rho(A) \le \max\{r_i : 1 \le i \le n\}.$$

Equality occurs on the left if and only if it occurs on the right, and
this happens if and only if $r_1 = r_2 = \cdots = r_n$.


Proof. Let $y = [y_1\ y_2\ \cdots\ y_n]^T$ be a Perron vector corresponding
to the Perron eigenvalue $\rho(A)$. Then y is a positive vector.
Let

$$y_s = \min\{y_1, y_2, \ldots, y_n\} \quad \text{and} \quad y_l = \max\{y_1, y_2, \ldots, y_n\}.$$

Then $y_s, y_l > 0$, and from the equations

$$\sum_{j=1}^{n} a_{sj} y_j = \rho(A) y_s \quad \text{and} \quad \sum_{j=1}^{n} a_{lj} y_j = \rho(A) y_l$$

we get, similar to the proof of Theorem 8.3.3, that

$$y_s r_s \le \rho(A) y_s \quad \text{and} \quad \rho(A) y_l \le y_l r_l,$$

and hence

$$\min\{r_i : 1 \le i \le n\} \le \rho(A) \le \max\{r_i : 1 \le i \le n\}. \qquad (8.4)$$

It is straightforward to check that equality holds in either of the
two inequalities in (8.4) if and only if the Perron eigenvector y is
a constant vector. But a constant vector is a Perron eigenvector
if and only if $r_1 = r_2 = \cdots = r_n$, and the theorem now follows.
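
The row-sum bounds of Theorem 8.3.6 are easy to observe on a small example. The sketch below is an illustration under stated assumptions: the function `spectral_radius` uses plain power iteration, which is a technique chosen here (not taken from the text) and which is assumed to converge because the sample matrices are positive, hence primitive. The sample matrices themselves are also hypothetical.

```python
def row_sums(A):
    """Row sums r_1, ..., r_n of the matrix A."""
    return [sum(row) for row in A]

def spectral_radius(A, iterations=200):
    """Approximate rho(A) by power iteration.

    Assumption: A is a primitive nonnegative matrix, so the iteration
    converges; for an imprimitive matrix it may fail to converge.
    """
    n = len(A)
    x = [1.0] * n
    estimate = 0.0
    for _ in range(iterations):
        y = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        estimate = max(y)                      # current approximation of rho(A)
        x = [value / estimate for value in y]  # rescale so the largest entry is 1
    return estimate

A = [[1, 2], [3, 4]]     # row sums 3 and 7; rho(A) = (5 + sqrt(33)) / 2
r = row_sums(A)
rho = spectral_radius(A)

B = [[1, 2], [2, 1]]     # equal row sums, so both bounds are attained
rho_B = spectral_radius(B)

print(min(r), rho, max(r), rho_B)
```

For the matrix with unequal row sums both inequalities are strict, while for the matrix with constant row sums the spectral radius equals the common row sum.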


As we have seen, the Perron-Frobenius theory of nonnegative 
matrices depends substantially on the zero-nonzero pattern of a 
matrix, and this translates to the digraph. For instance, an irreducible
matrix becomes a strongly connected digraph; for more on
this, one may consult [3] and [7]. We conclude this brief introduction
to spectral properties of nonnegative matrices by mentioning
some applications to the adjacency matrices of graphs (more gen- 
erally, multigraphs). (The Perron-Frobenius theorem can also be 
applied to the adjacency matrix of a multidigraph, but we shall 
not go in this direction.) 


8.4 Graph Spectra 


In the theory of graph spectra (see, for example, [18], [71], [24]), 
the results of matrix theory are used for investigations of graphs. 
In this section we present some basic properties of graph spectra. 
In Sections 10.2 and 10.3 we discuss some applications of graph 
spectra in physics and chemistry that also provide motivation for 
founding the theory. 

We start with the following definition: 


Definition 8.4.1 Let G be a multigraph whose vertex set is
{1, 2, ..., n}, and let $A = [a_{ij}]$ be an adjacency matrix of G. Then
$a_{ij}$ equals the number of edges between vertices i and j and hence
A is a nonnegative, symmetric integral matrix of order n. By
Theorem 7.1.9, because A is a real symmetric matrix, each of its
eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$ is real, and we may choose our notation
so that

$$\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n.$$

The characteristic polynomial of A is called the characteristic
polynomial of the multigraph G, and the eigenvalues of A are called
the eigenvalues of the multigraph G. The spectrum of the multigraph
G is the collection of its n eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$. By
Theorem 8.3.4, the eigenvalue $r = \lambda_1$ is the spectral radius of A
and $\lambda_n \ge -r$. Thus the spectrum of G lies in the interval [−r, r],
where r is the largest eigenvalue. The eigenvalue r is called the
index of the multigraph G.


We now restrict our attention to graphs G. Thus G has no 
loops and at most one edge joins each pair of vertices. The adjacency
matrix A of G is a symmetric matrix of 0’s and 1’s with
only 0’s on the main diagonal.


Example 8.4.2 The complete graph $K_n$ has adjacency matrix
$A = J_n - I_n$. All the row and column sums of A equal n − 1
and so, by Theorem 8.3.6, the index of $K_n$ equals n − 1. We have
$(-1)I_n - A = -J_n$. Because $-J_n$ has rank 1, this implies that −1 is
an eigenvalue of A with geometric, and thus algebraic, multiplicity
equal to n − 1. Thus the eigenvalues of $K_n$ are n − 1, −1, ..., −1
(with n − 1 eigenvalues equal to −1).

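Now let G be the complete bipartite graph $K_{p,q}$. Then an
adjacency matrix is

$$A = \begin{bmatrix} O_p & J_{p,q} \\ J_{q,p} & O_q \end{bmatrix}.$$

Squaring A we see that

$$A^2 = \begin{bmatrix} qJ_p & O \\ O & pJ_q \end{bmatrix}.$$

The eigenvalues of $A^2$ are

$$pq, \underbrace{0, \ldots, 0}_{p-1}, pq, \underbrace{0, \ldots, 0}_{q-1}$$

and, because the trace of A equals 0, the eigenvalues of $K_{p,q}$ are
$\pm\sqrt{pq}$ followed by (p + q − 2) 0’s.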
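
The spectra in Example 8.4.2 can be confirmed by exhibiting eigenvectors. The following Python sketch is illustrative only; the helper names and the sample sizes n = 5 and (p, q) = (2, 3) are assumptions made for this example.

```python
import math

def mat_vec(M, v):
    """Product of the square matrix M (a list of rows) and the vector v."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def is_eigenpair(M, lam, v, tol=1e-9):
    """Check M v = lam v up to a small numerical tolerance."""
    return all(abs(a - lam * b) < tol for a, b in zip(mat_vec(M, v), v))

def complete_graph(n):
    """Adjacency matrix J_n - I_n of K_n."""
    return [[0 if i == j else 1 for j in range(n)] for i in range(n)]

def complete_bipartite(p, q):
    """Adjacency matrix of K_{p,q}: vertices 0..p-1 on one side, the rest on the other."""
    return [[1 if (i < p) != (j < p) else 0 for j in range(p + q)] for i in range(p + q)]

n = 5
K = complete_graph(n)
index_ok = is_eigenpair(K, n - 1, [1] * n)
# The n - 1 independent vectors e_1 - e_k (k = 2, ..., n) are eigenvectors for -1.
minus_one_ok = all(
    is_eigenpair(K, -1, [1 if i == 0 else (-1 if i == k else 0) for i in range(n)])
    for k in range(1, n)
)

p, q = 2, 3
B = complete_bipartite(p, q)
root = math.sqrt(p * q)
plus_ok = is_eigenpair(B, root, [math.sqrt(q)] * p + [math.sqrt(p)] * q)
minus_ok = is_eigenpair(B, -root, [math.sqrt(q)] * p + [-math.sqrt(p)] * q)

print(index_ok, minus_one_ok, plus_ok, minus_ok)
```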

If G is a connected bipartite graph with index r, then all cycles 
have even length and hence the index of imprimitivity of its adja- 
cency matrix (regarded as an adjacency matrix of a digraph and 
so with an edge from a vertex u to a vertex v if and only if there 
is an edge from vertex v to u as well) is a multiple of 2; hence it 
follows from (iv) of Theorem 8.3.4 that the spectrum is symmetric 
about zero; in particular, −r is also an eigenvalue of G. If G is
connected but not bipartite, then the index of imprimitivity of its
adjacency matrix is 1 (G has closed walks of length 2 and a cycle of
odd length), and hence −r cannot be an eigenvalue of G.


Theorem 8.4.3 Let G be a graph whose eigenvalues are
$\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n$, let $r = \lambda_1$ be the index of G, and let
$s = \lambda_n$ be the smallest eigenvalue of G. Then

(i) The eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$ are real and satisfy

$$\lambda_1 + \lambda_2 + \cdots + \lambda_n = 0.$$

(ii) The number of edges of G equals

$$\frac{\lambda_1^2 + \lambda_2^2 + \cdots + \lambda_n^2}{2}.$$

(iii) If G has no edges, then all of its eigenvalues equal 0.

(iv) If G has at least one edge, then $1 \le r \le n - 1$ and
$-r \le s \le -1$. We have r = n − 1 if and only if $G = K_n$, and r = 1 if
and only if each connected component of G is either $K_1$ or
$K_2$ (there must be at least one $K_2$ because G has at least one
edge). We also have s = −1 if and only if each connected
component of G is a complete graph, and s = −r if and only
if the connected component of G with the largest index is a
bipartite graph.


Proof. Let A be the adjacency matrix of G. Then A is symmetric
with trace equal to zero. Hence (i) holds. The trace of $A^2$ equals
the number of closed walks of length 2 in G, and each edge accounts
for exactly two such walks (one starting at each of its two
vertices); hence the number of edges of G equals the trace of $A^2$
divided by 2. Because $A^2$ has eigenvalues $\lambda_1^2, \lambda_2^2, \ldots, \lambda_n^2$, (ii) holds.
The adjacency matrix of a graph with no edges is a zero matrix,
and (iii) follows. Because the index of $K_n$ is n − 1, every subgraph
of $K_n$ not equal to $K_n$ has a strictly smaller index by (v) and (vi)
of Theorem 8.3.4. Because G has at least one edge, $K_2$ is a subgraph
of G where the spectrum of $K_2$ is 1, −1. The assertions in
(iv) about r now follow easily. Because $K_2$ has least eigenvalue
equal to −1, it follows from the interlacing theorem for symmetric
matrices [47] that $s \le -1$ with equality if and only if each
connected component is a complete graph. That $s \ge -r$ follows
since the index r of G is the largest eigenvalue in
absolute value. That s = −r if and only if the connected component
of largest index is bipartite follows from the last assertion in
Example 8.4.2.

When considering determinants of symmetric matrices, in particular
adjacency matrices of graphs, it is useful to introduce a
concept that is related to linear subdigraphs.

Let G be a graph. A subgraph of G whose components are
circuits or graphs $K_2$ is called a basic figure of G. A basic figure
is a spanning basic figure if it contains all vertices of G. If U is
a basic figure, then p(U) denotes the number of components and
c(U) the number of circuits of U.


Lemma 8.4.4 Let G be a graph on n vertices with adjacency matrix
A. Then

$$\det A = (-1)^n \sum_{U} (-1)^{p(U)} 2^{c(U)},$$

where the summation extends over all spanning basic figures U of G.


Proof. The Coates digraph D*(A) of A is obtained from G by
replacing each edge of G with a cycle of two vertices. Because all
nonzero entries of A equal 1, the weight of any linear subdigraph of
D*(A) is equal to 1. Each linear subdigraph of D*(A) can be
obtained from a spanning basic figure of G by replacing each isolated
edge (i.e., component equal to $K_2$) by a cycle of two vertices, and
by introducing an orientation to each edge of each circuit in one
of the two possible ways so that we get a cycle. Therefore, starting
from a spanning basic figure U, we can construct $2^{c(U)}$ linear
subdigraphs, each of which has p(U) cycles. Now formula (4.1), which
defines the determinant of a matrix, reduces to the formula in the lemma.


We now obtain a formula for the characteristic polynomial of 
a graph. 


Theorem 8.4.5 The characteristic polynomial of a graph G on n
vertices is given by

$$\sum_{i=0}^{n} a_i \lambda^{n-i},$$

where

$$a_i = \sum_{U_i} (-1)^{p(U_i)} 2^{c(U_i)} \quad (i = 0, 1, \ldots, n)$$

and the summation extends over all basic figures $U_i$ of G with i
vertices.

Proof. If we apply formula (7.2) to the adjacency matrix
A of G, we get that $a_i$ is equal to $(-1)^i$ times the sum of the
determinants of all the principal submatrices of A of order i. The
result now follows from Lemma 8.4.4.


In the case of a forest, in particular a tree, we obtain a simpler 
expression for the characteristic polynomial. 


Corollary 8.4.6 The characteristic polynomial of a forest W with
n vertices is equal to

$$\sum_{k=0}^{\lfloor n/2 \rfloor} (-1)^k m(W, k) \lambda^{n-2k}, \qquad (8.5)$$

where m(W, k) is the number of k-matchings in W (so that m(W, 0) = 1).


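Proof. A forest has no circuits, so every component of a basic
figure of a forest is a $K_2$. Hence basic figures with an odd number
of vertices do not exist, and for even i = 2k, a basic figure U with
i vertices is just a k-matching, with p(U) = k and c(U) = 0. The
corollary now follows from Theorem 8.4.5.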
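
Formula (8.5) can be compared with a direct computation of the characteristic polynomial on a small tree. The sketch below is an illustration, not the method of the text: the characteristic polynomial is computed with the Faddeev-LeVerrier recurrence (a technique swapped in here for convenience), matchings are counted by brute force, and all function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def char_poly(A):
    """Coefficients of det(xI - A), highest power first (Faddeev-LeVerrier)."""
    n = len(A)
    M = [[0] * n for _ in range(n)]
    c = Fraction(1)
    coeffs = [c]
    for k in range(1, n + 1):
        AM = mat_mul(A, M)
        M = [[AM[i][j] + (c if i == j else 0) for j in range(n)] for i in range(n)]
        AM_next = mat_mul(A, M)
        c = Fraction(-sum(AM_next[i][i] for i in range(n)), k)
        coeffs.append(c)
    return [int(x) for x in coeffs]   # the coefficients are integers for integer A

def count_matchings(edges, k):
    """Number of k-matchings: sets of k edges, no two sharing a vertex."""
    return sum(1 for chosen in combinations(edges, k)
               if len({v for e in chosen for v in e}) == 2 * k)

def forest_char_poly(n, edges):
    """Coefficients from (8.5): entry 2k is (-1)^k m(W, k), odd entries are 0."""
    coeffs = [0] * (n + 1)
    for k in range(n // 2 + 1):
        coeffs[2 * k] = (-1) ** k * count_matchings(edges, k)
    return coeffs

def adjacency(n, edges):
    A = [[0] * n for _ in range(n)]
    for u, v in edges:
        A[u][v] = A[v][u] = 1
    return A

path_edges = [(0, 1), (1, 2), (2, 3)]      # the path on 4 vertices
from_matchings = forest_char_poly(4, path_edges)
from_matrix = char_poly(adjacency(4, path_edges))
print(from_matchings, from_matrix)
```

Both computations give the coefficient list of $\lambda^4 - 3\lambda^2 + 1$ for the path on 4 vertices.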


8.5 Exercises 


1. Explain how the Frobenius normal form of a permutation 
matrix of order n is determined. 


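2. Determine the Frobenius normal form of the matrix

$$\begin{bmatrix} 0 & 1 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 3 & 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 & 2 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 & 2 \\ 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 2 & 0 & 0 & 0 & 0 \end{bmatrix}.$$


3. Determine the Frobenius normal form of the matrix

$$\begin{bmatrix} 0 & 0 & 0 & 1 & 0 & 1 \\ 1 & 2 & 2 & 0 & 0 & 1 \\ 1 & 0 & 1 & 0 & 1 & 1 \\ 2 & 0 & 0 & 0 & 0 & 3 \\ 0 & 3 & 1 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 & 3 \end{bmatrix}.$$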


4. Show that the following matrices are primitive and determine 
their exponents: 


01000 0 010000 
001000 001000 
000100 000100 
000010] ™)000010 
to 0050 A 000204 
iM Com pg me Oa 110000 


5. Show that a primitive matrix of order n > 2 contains at least 
n + 1 positive entries. 


6. Show that if A is primitive, so is $A^k$ for every positive integer k.


7. Let 


> 

|| 
Hoooo 
ooorr 
ooroo 
oH-ooo 


A 


— oooO - 


Construct the digraph D(A) and show that A is primitive
with exponent 17.


8. Let A be an irreducible nonnegative matrix with at least
one positive diagonal element. Prove that A is primitive
with exponent at most 2(n − 1).


9. Use the examples below to show that a nonnegative reducible
matrix may or may not have a positive eigenvector:


foo} [ro] 


10. Determine the Perron root and Perron eigenvector of the
matrix

$$\begin{bmatrix} 2 & 3 \\ 1 & 2 \end{bmatrix}.$$


11. Determine the eigenvalues of the graph obtained from the 
complete graph K; by removing an edge. 


12. Prove that the largest eigenvalue of a regular graph of degree 
r is equal to r. 


13. Let G be a regular graph of degree r with characteristic
polynomial $p(\lambda)$. Determine the characteristic polynomial
of the complement $\overline{G}$ of G obtained by joining two vertices
by an edge in $\overline{G}$ if and only if they are not joined in G.


14. Check whether the graphs $K_{1,4}$ and $C_4 \cup K_1$ have the same
spectrum.¹


¹Nonisomorphic graphs that have the same spectrum are called cospectral
graphs.


Chapter 9 


Additional Topics 


In this chapter we first introduce some special matrix products 
(the tensor product and Hadamard product) and prove some of 
their properties. In Section 9.2, given a square matrix, we show 
how to determine regions in the complex plane that are sure to 
contain all of its eigenvalues. In Section 9.3, we introduce an im- 
portant combinatorial counting function, called the permanent, 
which, although similar to the determinant in definition, is noto- 
riously difficult to compute in general. 


9.1 Tensor and Hadamard Product 


Let $A = [a_{ij}]$ and $B = [b_{ij}]$ be matrices of sizes m by p and q by n,
respectively. If the number p of columns of A equals the number
q of rows of B, then, as we know, A and B can be multiplied to
give an m by n matrix whose (i, j)-entry equals

$$\sum_{k=1}^{p} a_{ik} b_{kj} \quad (1 \le i \le m,\ 1 \le j \le n).$$


There are other special products of matrices that are often useful 
in applications. 


Definition 9.1.1 The tensor product (also called the Kronecker
product) of A and B in this order is the mq by pn matrix $A \otimes B$
obtained from A by replacing each entry $a_{ij}$ of A with the q by n
matrix

$$a_{ij} B \quad (1 \le i \le m,\ 1 \le j \le p).$$

The tensor product has a natural partitioned form. If A and B
have the same size, that is, m = q and p = n, then the Hadamard-Schur
product or entrywise product is the m by n matrix

$$A \circ B = [a_{ij} b_{ij}]$$

obtained by multiplying corresponding entries of A and B.


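Example 9.1.2 Let

$$A = \begin{bmatrix} 2 & 6 & 3 \\ 4 & 7 & 5 \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} 1 & 4 \\ 5 & 3 \end{bmatrix}.$$

Then

$$A \otimes B = \begin{bmatrix} 2B & 6B & 3B \\ 4B & 7B & 5B \end{bmatrix} = \left[\begin{array}{cc|cc|cc} 2 & 8 & 6 & 24 & 3 & 12 \\ 10 & 6 & 30 & 18 & 15 & 9 \\ \hline 4 & 16 & 7 & 28 & 5 & 20 \\ 20 & 12 & 35 & 21 & 25 & 15 \end{array}\right].$$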
If 
4 -3 5 
E 1 a: 
then 


T 2(4) 6-3) 3(5)]_ [ 8 -18 15 
abah 7(1) TE 7 I 
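
Both products are short to compute directly from their definitions. In the Python sketch below, the function names `kron` and `hadamard` and the sample matrices are assumptions made for illustration; the first two rows of the tensor product match the rows displayed in the example above.

```python
def kron(A, B):
    """Tensor (Kronecker) product: each entry a of A is replaced by the block a * B."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def hadamard(A, C):
    """Entrywise (Hadamard-Schur) product of two matrices of the same size."""
    return [[a * c for a, c in zip(row_a, row_c)] for row_a, row_c in zip(A, C)]

A = [[2, 6, 3], [4, 7, 5]]     # 2 by 3
B = [[1, 4], [5, 3]]           # 2 by 2
T = kron(A, B)                 # 4 by 6

# Entrywise product of two single-row matrices (hypothetical sample data).
H = hadamard([[2, 6, 3]], [[4, -3, 5]])

for row in T:
    print(row)
print(H)
```

The row order produced by `kron` (outer loop over the rows of A, inner loop over the rows of B) is the order in which the blocks $a_{ij}B$ are laid out in the definition.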


Assume now that $A = [a_{ij}]$ and $B = [b_{kl}]$ are square matrices
of orders m and n, respectively. Let the vertices of the digraph
D(A) of A be {1, 2, ..., m}, and let the vertices of the digraph
D(B) be {1, 2, ..., n}. Then the digraph $D(A \otimes B)$ has vertices

$$\{(i, j) : 1 \le i \le m,\ 1 \le j \le n\}, \qquad (9.1)$$

with an edge from a vertex (i, j) to a vertex (k, l) if and only if
there is an edge from i to k in D(A) and an edge from j to l in
D(B). It is natural to label the rows and columns of $A \otimes B$ with
the ordered pairs given in (9.1). With these labels, the rows and
columns of $A \otimes B$ occur in the order

$$(1,1), \ldots, (1,n), (2,1), \ldots, (2,n), \ldots, (m,1), \ldots, (m,n).$$

Moreover, the entry $c_{(i,j),(k,l)}$ in position ((i, j), (k, l)) is given by

$$c_{(i,j),(k,l)} = a_{ik} b_{jl}.$$

The digraph $D(A \otimes B)$ is the tensor product (or Kronecker product)
of the digraphs D(A) and D(B). As a weighted digraph, the edge
from (i, j) to (k, l) has weight $a_{ik} b_{jl}$, the product of the weights of
the edge from i to k in D(A) and the edge from j to l in D(B).

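Although $A \otimes B$ and $B \otimes A$ have the same size, it is not true
in general that $A \otimes B = B \otimes A$. For example, if $I_2$ is the identity
matrix of order 2 and $J_2$ is the matrix of order 2 each of whose
entries equals 1, then

$$I_2 \otimes J_2 = \begin{bmatrix} 1 & 1 & 0 & 0 \\ 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 1 & 1 \end{bmatrix} \ne J_2 \otimes I_2 = \begin{bmatrix} 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \\ 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \end{bmatrix}.$$

Although $A \otimes B \ne B \otimes A$ in general, the following theorem about
their digraphs does hold.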


Theorem 9.1.3 There exists a permutation matrix P of order
mn such that

$$P(A \otimes B)P^T = B \otimes A.$$

Equivalently, the digraphs $D(A \otimes B)$ and $D(B \otimes A)$ are isomorphic
with the isomorphism preserving weight.


Proof. In the weighted digraph $D(A \otimes B)$ there is an edge
from (i, j) to (k, l) of weight $a_{ik} b_{jl}$. Similarly, in the weighted
digraph $D(B \otimes A)$ there is an edge from (j, i) to (l, k) of weight
$b_{jl} a_{ik} = a_{ik} b_{jl}$. Thus the bijection

$$(i, j) \longrightarrow (j, i) \quad (1 \le i \le m,\ 1 \le j \le n)$$

is an isomorphism of the two weighted digraphs.
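
The permutation matrix of Theorem 9.1.3 can be written down explicitly from the bijection that swaps the two coordinates of a label. The following Python sketch is an illustration with hypothetical helper names and sample matrices; indices are 0-based, so the label (i, j) of $A \otimes B$ sits in position i·n + j and the label (j, i) of $B \otimes A$ in position j·m + i.

```python
def kron(A, B):
    """Tensor (Kronecker) product of two matrices given as lists of rows."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def mat_mul(X, Y):
    """Matrix product of X and Y (sizes must be compatible)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(M):
    return [list(col) for col in zip(*M)]

def swap_permutation(m, n):
    """Permutation matrix of order mn sending the label (i, j) to the label (j, i)."""
    P = [[0] * (m * n) for _ in range(m * n)]
    for i in range(m):
        for j in range(n):
            P[j * m + i][i * n + j] = 1
    return P

A = [[1, 2], [3, 4]]                        # order m = 2
B = [[0, 1, 0], [2, 0, 5], [0, 7, 1]]       # order n = 3
P = swap_permutation(2, 3)

left = mat_mul(mat_mul(P, kron(A, B)), transpose(P))
right = kron(B, A)
print(left == right, kron(A, B) == kron(B, A))
```

For these sample matrices the two tensor products differ, yet they become equal after the simultaneous row and column permutation.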


In the following theorem we collect a number of elementary 
identities for the tensor product. 


Theorem 9.1.4 If A, B, C, and D are matrices of appropriate sizes
in order to carry out the indicated operations, the following hold:

(i) (associative rule) $A \otimes (B \otimes C) = (A \otimes B) \otimes C$.

(ii) (distributive rule) $A \otimes (B + C) = A \otimes B + A \otimes C$.

(iii) (distributive rule) $(A + B) \otimes C = A \otimes C + B \otimes C$.

(iv) (transpose rule) $(A \otimes B)^T = A^T \otimes B^T$.

(v) (product rule) $(A \otimes B)(C \otimes D) = AC \otimes BD$.


Proof. Identities (i)-(iv) can be verified in a straightforward
manner. We now verify (v). In order for (v) to make sense, the
number of columns of A has to equal the number of rows of C,
and the number of columns of B has to equal the number of rows
of D.

The entry in position ((i, j), (k, l)) of $(A \otimes B)(C \otimes D)$ equals

$$\sum_{p} \sum_{q} a_{ip} b_{jq} \cdot c_{pk} d_{ql} = \sum_{p} a_{ip} c_{pk} \cdot \sum_{q} b_{jq} d_{ql},$$

and this is the same as the entry in the ((i, j), (k, l)) position of
$AC \otimes BD$.


Note that if A and B are square matrices of orders m and n,
respectively, then (v) implies that

$$A \otimes B = (A \otimes I_n)(I_m \otimes B).$$
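
The product rule holds for rectangular matrices as long as the sizes are compatible, which a small numerical check makes concrete. The sketch below is illustrative; the helper names and all sample matrices are assumptions of this example.

```python
def kron(A, B):
    """Tensor (Kronecker) product of two matrices given as lists of rows."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def mat_mul(X, Y):
    """Matrix product of X and Y (sizes must be compatible)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

# Rectangular case: A is 2x3, C is 3x2, B is 1x2, D is 2x2.
A = [[1, 2, 0], [0, 1, 3]]
C = [[1, 0], [2, 1], [0, 4]]
B = [[2, 1]]
D = [[1, 1], [0, 3]]
lhs = mat_mul(kron(A, B), kron(C, D))       # (A (x) B)(C (x) D)
rhs = kron(mat_mul(A, C), mat_mul(B, D))    # AC (x) BD

# Square case: S of order m = 2 and T of order n = 3.
S = [[1, 2], [3, 4]]
T = [[0, 1, 0], [2, 0, 5], [0, 7, 1]]
factored = mat_mul(kron(S, identity(3)), kron(identity(2), T))

print(lhs == rhs, factored == kron(S, T))
```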


The product rule (v) in Theorem 9.1.4 has some useful and,
in some cases, surprising consequences for the tensor product of
square matrices.

Theorem 9.1.5 Let A and B be square matrices of orders m and
n, respectively. Then the following hold:

(i) If A and B are invertible, then $A \otimes B$ is invertible and
$(A \otimes B)^{-1} = A^{-1} \otimes B^{-1}$.

(ii) If $\lambda_1, \lambda_2, \ldots, \lambda_m$ are the eigenvalues of A and $\mu_1, \mu_2, \ldots, \mu_n$
are the eigenvalues of B, then the eigenvalues of $A \otimes B$ are
the mn products of the eigenvalues of A with the eigenvalues
of B:

$$\lambda_i \mu_j \quad (1 \le i \le m,\ 1 \le j \le n).$$

(iii) $\det(A \otimes B) = (\det A)^n (\det B)^m$.

Proof. To establish (i) we use the product rule for tensor
products to compute

$$(A \otimes B)(A^{-1} \otimes B^{-1}) = (AA^{-1}) \otimes (BB^{-1}) = I_m \otimes I_n = I_{mn}.$$

Thus $A^{-1} \otimes B^{-1}$ is the inverse of $A \otimes B$. It is easy to establish
that if $\lambda$ is an eigenvalue of A and $\mu$ is an eigenvalue of B, then
$\lambda\mu$ is an eigenvalue of $A \otimes B$. We simply choose an eigenvector
$u \ne 0$ of A for $\lambda$ and an eigenvector $v \ne 0$ of B for $\mu$. Then $u \otimes v$
is not the zero vector and, by the product rule,

$$(A \otimes B)(u \otimes v) = (Au) \otimes (Bv) = (\lambda u) \otimes (\mu v) = \lambda\mu (u \otimes v).$$

In order to know that the entire collection of eigenvalues of $A \otimes B$
is as given in (ii) (that is, that the multiplicities work out), we use
the Jordan canonical forms $J_A$ and $J_B$ of A and B, respectively.
There exist invertible matrices P and Q such that $P^{-1}AP = J_A$
and $Q^{-1}BQ = J_B$, and, by (i), $P \otimes Q$ is invertible with
$(P \otimes Q)^{-1} = P^{-1} \otimes Q^{-1}$. Using the product rule, we get that

$$\begin{aligned} (P \otimes Q)^{-1}(A \otimes B)(P \otimes Q) &= (P^{-1} \otimes Q^{-1})(A \otimes B)(P \otimes Q) \\ &= (P^{-1}AP) \otimes (Q^{-1}BQ) \\ &= J_A \otimes J_B. \end{aligned}$$

Now $J_A$ is a triangular matrix with $\lambda_1, \lambda_2, \ldots, \lambda_m$ on its main
diagonal, and $J_B$ is a triangular matrix with $\mu_1, \mu_2, \ldots, \mu_n$ on its
main diagonal. The matrix $J_A \otimes J_B$ is then a triangular matrix
with the mn numbers $\lambda_i \mu_j$ ($1 \le i \le m$, $1 \le j \le n$) on its main
diagonal. Because the eigenvalues of a triangular matrix are its
diagonal entries, this establishes (ii).

To prove (iii), we simply note that $\det A = \lambda_1 \lambda_2 \cdots \lambda_m$ and
$\det B = \mu_1 \mu_2 \cdots \mu_n$, and use (ii) to conclude that

$$\begin{aligned} \det(A \otimes B) &= \prod_{i=1}^{m} \prod_{j=1}^{n} \lambda_i \mu_j \\ &= (\lambda_1 \lambda_2 \cdots \lambda_m)^n (\mu_1 \mu_2 \cdots \mu_n)^m \\ &= (\det A)^n (\det B)^m. \end{aligned}$$
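
Parts (ii) and (iii) of Theorem 9.1.5 can be checked exactly on small integer matrices. In the sketch below, which is an illustration only, the sample matrices and their eigenpairs are hypothetical data chosen so that everything stays in integer arithmetic, and the determinant is computed by Gaussian elimination over the rationals.

```python
from fractions import Fraction

def kron(A, B):
    """Tensor (Kronecker) product of two matrices given as lists of rows."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def mat_vec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def det(M):
    """Determinant by Gaussian elimination with exact rational arithmetic."""
    M = [[Fraction(x) for x in row] for row in M]
    n = len(M)
    d = Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if M[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            M[c], M[pivot] = M[pivot], M[c]
            d = -d                       # a row swap changes the sign
        d *= M[c][c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            M[r] = [M[r][k] - f * M[c][k] for k in range(n)]
    return d

A = [[2, 1], [1, 2]]                      # order m = 2, det A = 3
B = [[1, 2, 0], [0, 3, 0], [0, 0, 2]]     # order n = 3, det B = 6
eigen_A = [(3, [1, 1]), (1, [1, -1])]
eigen_B = [(1, [1, 0, 0]), (3, [1, 1, 0]), (2, [0, 0, 1])]

K = kron(A, B)
# (A (x) B)(u (x) v) = lambda * mu * (u (x) v) for every pair of eigenpairs.
products_ok = all(
    mat_vec(K, [a * b for a in u for b in v]) == [lam * mu * a * b for a in u for b in v]
    for lam, u in eigen_A
    for mu, v in eigen_B
)
det_K = det(K)
print(products_ok, det_K, det(A) ** 3 * det(B) ** 2)
```

Note how the exponents are swapped: the determinant of the matrix of order m is raised to the power n, and vice versa.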


We only briefly discuss the Hadamard-Schur product $A \circ B = [a_{ij} b_{ij}]$
of two square matrices $A = [a_{ij}]$ and $B = [b_{ij}]$ of order n.

The matrix $A \circ B$ is a principal submatrix of the tensor product
$A \otimes B$. In fact, using our labeling of the rows and columns of $A \otimes B$,
if we let K = {(1, 1), (2, 2), ..., (n, n)}, then $A \circ B$ is the principal
submatrix $(A \otimes B)[K, K]$ of $A \otimes B$. The weighted digraph $D(A \circ B)$
is obtained from the weighted digraphs of A and B by multiplying
the corresponding weights. In particular, there is an edge from
vertex i to vertex j of nonzero weight if and only if there is an
edge from i to j of nonzero weight in both D(A) and D(B). In
unweighted terms, the edges of $D(A \circ B)$ are the edges common
to D(A) and D(B).
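
The statement that the entrywise product is a principal submatrix of the tensor product can be seen directly by extracting the rows and columns labeled (i, i). The Python sketch below is illustrative, with hypothetical sample matrices; with 0-based indices the label (i, i) occupies position i·n + i.

```python
def kron(A, B):
    """Tensor (Kronecker) product of two matrices given as lists of rows."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

A = [[1, 2], [3, 4]]
B = [[5, 0], [6, 7]]
n = len(A)

K = [i * n + i for i in range(n)]                 # positions of the labels (i, i)
T = kron(A, B)
principal = [[T[r][c] for c in K] for r in K]     # the submatrix (A (x) B)[K, K]
entrywise = [[A[i][j] * B[i][j] for j in range(n)] for i in range(n)]

print(principal, entrywise)
```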


9.2 Eigenvalue Inclusion Regions 


Usually the eigenvalues of a square matrix cannot be determined 
exactly and, as a result, it is useful to determine regions in the 
complex plane that include all the eigenvalues and which can easily 
be computed. The first theorem giving an eigenvalue inclusion 
region is the theorem of Geršgorin proved in 1931. To state this
theorem we require a few preliminaries.

Let $A = [a_{ij}]$ be a matrix of order n. Let

$$r_i(A) = \sum_{j \ne i} |a_{ij}| \quad (i = 1, 2, \ldots, n)$$

be the sum of the absolute values of the entries in row i with the
entry $a_{ii}$ on the main diagonal deleted. In addition, let

$$\Gamma_i(A) = \{z : z \text{ a complex number},\ |z - a_{ii}| \le r_i(A)\}$$

be the disk in the complex plane centered at $a_{ii}$ with radius $r_i(A)$,
called the ith Geršgorin disk of A. Finally, let

$$\Gamma(A) = \bigcup_{i=1}^{n} \Gamma_i(A)$$

be the union of all the Geršgorin disks of A. Then $\Gamma(A)$ is a union
of disks in the complex plane and is called the Geršgorin region of
A.


Example 9.2.1 Let

$$A = \begin{bmatrix} 2 & 4 \\ 3 & i \end{bmatrix}.$$

Then $\Gamma_1(A)$ is the disk centered at the point (2, 0) on the real axis
with radius 4 and $\Gamma_2(A)$ is the disk centered at the point (0, 1) on
the imaginary axis with radius 3. Now let

$$A = \begin{bmatrix} 1 & 3 & 5 \\ 2 & 0 & 4 \\ -1 & 3 & -2 \end{bmatrix}.$$

Then $\Gamma_1(A)$, $\Gamma_2(A)$, $\Gamma_3(A)$ are the disks centered at
(1, 0), (0, 0), and (−2, 0) with radii 8, 6, and 4, respectively.
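
Geršgorin disks are computed row by row from the diagonal entry and the off-diagonal absolute row sum. The Python sketch below is an illustration: the function names are hypothetical, the 3 by 3 matrix is the one whose centers and radii are listed in Example 9.2.1, and the 2 by 2 complex matrix is sample data assumed for this sketch, with its eigenvalues obtained from the quadratic formula.

```python
import cmath

def gershgorin_disks(A):
    """List of (center, radius) pairs: center a_ii, radius r_i(A)."""
    n = len(A)
    return [(A[i][i], sum(abs(A[i][j]) for j in range(n) if j != i)) for i in range(n)]

def in_region(z, disks, tol=1e-12):
    """True if z lies in at least one of the disks."""
    return any(abs(z - center) <= radius + tol for center, radius in disks)

def eigenvalues_2x2(A):
    """Eigenvalues of a 2 by 2 matrix from its trace and determinant."""
    t = A[0][0] + A[1][1]
    d = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    s = cmath.sqrt(t * t - 4 * d)
    return [(t + s) / 2, (t - s) / 2]

A3 = [[1, 3, 5], [2, 0, 4], [-1, 3, -2]]
disks3 = gershgorin_disks(A3)

A2 = [[2, 4], [3, 1j]]          # hypothetical complex sample matrix
disks2 = gershgorin_disks(A2)
eigs2 = eigenvalues_2x2(A2)
contained = [in_region(z, disks2) for z in eigs2]

print(disks3, disks2, contained)
```

In this sample each eigenvalue lies in only one of the two disks, which illustrates that the inclusion region is the union of the disks and not any single disk.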


The theorem of Geršgorin is that the Geršgorin region of a 
matrix contains all its eigenvalues. 


Theorem 9.2.2 Let $A = [a_{ij}]$ be a matrix of order n. Then all
the eigenvalues of A are contained in its Geršgorin region
$\Gamma(A) = \bigcup_{i=1}^{n} \Gamma_i(A)$.

Proof. Let $\lambda$ be an eigenvalue of A and let $x = [x_1\ x_2\ \ldots\ x_n]^T$
be an eigenvector of A corresponding to $\lambda$. Then $Ax = \lambda x$ implies
that

$$\sum_{j=1}^{n} a_{ij} x_j = \lambda x_i \quad (i = 1, 2, \ldots, n). \qquad (9.2)$$

Because x is an eigenvector, x has an entry different from zero.
We choose k so that

$$|x_k| = \max\{|x_1|, |x_2|, \ldots, |x_n|\}.$$

Then $|x_k| > 0$, and we consider the kth equation in (9.2) and
obtain

$$\sum_{j=1}^{n} a_{kj} x_j = \lambda x_k.$$

Rewriting this equation by grouping together the two coefficients
of $x_k$, we get

$$(\lambda - a_{kk}) x_k = \sum_{j \ne k} a_{kj} x_j.$$

We now take the absolute value of both sides, and, using the
triangle inequality, we obtain

$$\begin{aligned} |\lambda - a_{kk}|\,|x_k| &= \Bigl|\sum_{j \ne k} a_{kj} x_j\Bigr| \\ &\le \sum_{j \ne k} |a_{kj}|\,|x_j| \\ &\le \sum_{j \ne k} |a_{kj}|\,|x_k| \\ &= r_k(A)\,|x_k|. \end{aligned}$$

Because $|x_k| > 0$, we obtain upon cancellation that

$$|\lambda - a_{kk}| \le r_k(A).$$

Thus $\lambda$ is in the kth Geršgorin disk, and hence in the Geršgorin
region.


As a corollary, we obtain an upper bound on the largest absolute
value of an eigenvalue of a matrix.

Corollary 9.2.3 Let $A = [a_{ij}]$ be a matrix of order n. Let

$$r(A) = \max\Bigl\{\sum_{j=1}^{n} |a_{ij}| : 1 \le i \le n\Bigr\},$$

the maximum of the sum of the absolute values of the entries in
each row of A. Then all the eigenvalues of A are contained in the
disk

$$\{z : z \text{ a complex number},\ |z| \le r(A)\}.$$


Proof. Let $\lambda$ be an eigenvalue of A. By Theorem 9.2.2, there
exists a k such that $\lambda$ is in the kth Geršgorin disk:

$$|\lambda - a_{kk}| \le r_k(A) = \sum_{j \ne k} |a_{kj}|.$$

Because for complex numbers x and y, $|x| - |y| \le |x - y|$, we obtain

$$|\lambda| - |a_{kk}| \le \sum_{j \ne k} |a_{kj}|,$$

that is,

$$|\lambda| \le \sum_{j=1}^{n} |a_{kj}| \le r(A).$$


Theorem 9.2.2 implies a result that gives a sufficient condition
for the invertibility of a matrix. We state this as another corollary.
Call a matrix $A = [a_{ij}]$ diagonally dominant provided

$$|a_{ii}| > \sum_{j \ne i} |a_{ij}| \quad (i = 1, 2, \ldots, n). \qquad (9.3)$$


Corollary 9.2.4 Let $A = [a_{ij}]$ be a diagonally dominant matrix
of order n. Then A is invertible.


Proof. We know that a matrix is invertible if and only if 0 is
not one of its eigenvalues. The diagonal dominance condition (9.3)
implies that none of the Geršgorin disks

$$\Gamma_i(A) = \{z : z \text{ a complex number},\ |z - a_{ii}| \le r_i(A)\}$$

contains the complex number 0. Because by Theorem 9.2.2 the
eigenvalues of A are contained in the union of these Geršgorin
disks, 0 is not an eigenvalue of A and A is invertible.


It is natural to ask about the boundary of the Geršgorin region
$\Gamma(A)$ of A. We state without proof the following theorem of
Taussky.


Theorem 9.2.5 Let A be an irreducible matrix of order n, that
is, the digraph D(A) is strongly connected. Then, if an eigenvalue
$\lambda$ lies on the boundary of the Geršgorin region, then $\lambda$ is on
the boundary of every one of the n Geršgorin disks, that is,

$$|\lambda - a_{ii}| = r_i(A) \quad (i = 1, 2, \ldots, n).$$


We now turn to showing how knowledge of the digraph of a 
matrix can be used to obtain refined eigenvalue inclusion regions. 
We need a few preliminaries. 

Let D be a digraph with vertex set {1, 2, ..., n}, where there is
a weight $w_i$ associated with each vertex i ($1 \le i \le n$). Thus D is
a vertex-weighted digraph. Let u be a vertex of D and let (u, v) be
an edge leaving u. Then (u, v) is a dominant edge from u provided
that for each edge (u, x) leaving u, we have $w_v \ge w_x$. Thus the
edge (u, v) is a dominant edge from u provided there is no edge
from u to a vertex of larger weight than the weight of v. Now
consider a cycle $\gamma = a_1, a_2, \ldots, a_k, a_1$. Then $\gamma$ is a dominant cycle
in D provided that each of its edges $(a_1, a_2), \ldots, (a_{k-1}, a_k), (a_k, a_1)$
is a dominant edge.


Lemma 9.2.6 Let D be a vertex-weighted digraph such that each 
vertex has a positive outdegree. Then D has a dominant cycle. 


Proof. We start at any vertex $x_1$ of D and choose a dominant
edge $(x_1, x_2)$ leaving $x_1$; such an edge exists because $x_1$ has
positive outdegree. Then we choose a dominant edge $(x_2, x_3)$
leaving $x_2$. We continue like this until we first repeat a vertex, say
vertex $x_k$ (this must happen because D is finite), thereby obtaining
a cycle

$$\gamma = x_k, x_{k+1}, \ldots, x_p = x_k.$$

Then $\gamma$ is a dominant cycle.
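
The proof of Lemma 9.2.6 is constructive, and the walk it describes is short to program. The following Python sketch is an illustration under stated assumptions: the digraph is given as a dictionary `out_neighbors` mapping each vertex to a nonempty list of out-neighbors, the weights are given as a dictionary `weight`, and both the function names and the sample digraph are hypothetical.

```python
def dominant_cycle(out_neighbors, weight, start=None):
    """Follow dominant edges from a start vertex until a vertex repeats.

    Assumes every vertex has positive outdegree. Returns the vertices of
    the resulting cycle in order (the first vertex is not repeated at the end).
    """
    v = next(iter(out_neighbors)) if start is None else start
    position = {}          # vertex -> index at which it first appears on the walk
    walk = []
    while v not in position:
        position[v] = len(walk)
        walk.append(v)
        # A dominant edge leads to an out-neighbor of maximum weight.
        v = max(out_neighbors[v], key=lambda x: weight[x])
    return walk[position[v]:]

def is_dominant_cycle(cycle, out_neighbors, weight):
    """Check that each edge of the cycle is an edge of the digraph and is dominant."""
    k = len(cycle)
    return all(
        cycle[(t + 1) % k] in out_neighbors[cycle[t]]
        and weight[cycle[(t + 1) % k]] >= max(weight[x] for x in out_neighbors[cycle[t]])
        for t in range(k)
    )

out_neighbors = {1: [2, 3], 2: [3], 3: [1, 2], 4: [1]}
weight = {1: 5, 2: 1, 3: 4, 4: 2}

cycle = dominant_cycle(out_neighbors, weight)
cycle_from_4 = dominant_cycle(out_neighbors, weight, start=4)
print(cycle, cycle_from_4)
```

Starting from vertex 4 the walk first visits a vertex that is not on the final cycle, which is why only the portion of the walk from the first repeated vertex onward is returned.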


Now let $A = [a_{ij}]$ be a matrix of order n and consider the
quantities

$$r_i(A) = \sum_{j \ne i} |a_{ij}| \quad (i = 1, 2, \ldots, n).$$

Let $D_0(A)$ be the digraph obtained by removing all loops (cycles
of length 1) from the digraph D(A). We regard $D_0(A)$ as a
vertex-weighted digraph where the weight of each vertex i equals $r_i(A)$.
We may also regard $D_0(A)$ as a vertex-weighted digraph where the
weight of vertex i is $|a_{ii}|$ (i = 1, 2, ..., n). We then have the
following theorem, which is a generalization of Corollary 9.2.4. (Above,
the eigenvalue inclusion region given by Theorem 9.2.2 gave a
condition (diagonal dominance) for a matrix to be invertible. We now
reverse this order: we first prove an invertibility theorem
and then obtain from it an eigenvalue inclusion region.)

If $\gamma$ is a cycle of a vertex-weighted digraph, then by $\prod_{\gamma} w_i$ we
mean the product of the weights of all the vertices i of $\gamma$.


Theorem 9.2.7 Let $A = [a_{ij}]$ be a matrix of order n each of
whose entries on the main diagonal is different from zero. Assume
that

$$\prod_{\gamma} |a_{ii}| > \prod_{\gamma} r_i(A) \qquad (9.4)$$

for all cycles $\gamma$ of D(A) of length at least 2. Then A is an invertible
matrix.


Proof. The digraphs D(A) and $D_0(A)$ have the same cycles
of length at least 2. We show that A is invertible by showing
that $\det A \ne 0$. The determinant of a matrix is the product of
the determinants of its irreducible components. Because we are
assuming that the entries on the main diagonal of A are different
from zero, we may assume that A is an irreducible matrix of order
$n \ge 2$; thus $D_0(A)$ is a strongly connected digraph with at least
two vertices.

Assume to the contrary that det A = 0. Then the rank of A is
strictly less than n and there is a nonzero vector $x = [x_1\ x_2\ \ldots\ x_n]^T$
such that Ax = 0. Let $I = \{i : x_i \ne 0\}$ and let A′ = A[I, I] be
the principal submatrix of A obtained by deleting those rows and
columns whose index does not belong to I. Let x′ be obtained
from x in a similar way. Then each coordinate of x′ is different
from zero and

$$A'x' = 0. \qquad (9.5)$$

Because each entry on the main diagonal of A′ is nonzero, (9.5)
implies that each vertex of $D_0(A')$ has an edge leaving it. We
weight the vertices of $D_0(A')$, that is, those i in I, by $|x_i|$ and apply
Lemma 9.2.6 to obtain a dominant cycle $\gamma = i_1, i_2, \ldots, i_p, i_{p+1} = i_1$
of $D_0(A')$ of length $p \ge 2$. From (9.5) we get that for $1 \le j \le p$,

$$a_{i_j i_j} x_{i_j} = -\sum_{k \in I \setminus \{i_j\}} a_{i_j k} x_k.$$

Using the triangle inequality and the fact that $\gamma$ is a dominant
cycle in $D_0(A')$ we obtain

$$\begin{aligned} |a_{i_j i_j}|\,|x_{i_j}| &\le \sum_{k \in I \setminus \{i_j\}} |a_{i_j k}|\,|x_k| \\ &\le \Bigl(\sum_{k \in I \setminus \{i_j\}} |a_{i_j k}|\Bigr) |x_{i_{j+1}}| \\ &\le r_{i_j}(A)\,|x_{i_{j+1}}|. \end{aligned}$$

Multiplying the last inequalities for j = 1, 2, ..., p we obtain

$$\prod_{\gamma} |a_{ii}| \prod_{j=1}^{p} |x_{i_j}| \le \prod_{\gamma} r_i(A) \prod_{j=1}^{p} |x_{i_{j+1}}|,$$

and because $x_i \ne 0$ for i in I and the two products of the $|x_i|$ are
equal (as $i_{p+1} = i_1$),

$$\prod_{\gamma} |a_{ii}| \le \prod_{\gamma} r_i(A).$$

This last inequality contradicts (9.4), and the proof is complete.


We remark that if the matrix A in Theorem 9.2.7 is irreducible,
the assumption that the entries on the main diagonal are different
from zero is implied by (9.4). This is because D(A) is then strongly
connected and every vertex is on a cycle.

We now obtain the eigenvalue inclusion region corresponding
to Theorem 9.2.7. For simplicity, we assume that our matrix is an
irreducible matrix of order at least 2. For each cycle $\gamma$ of D(A) of
length at least 2, we define the lemniscate

$$Z_{\gamma} = \Bigl\{z : \prod_{\gamma} |z - a_{ii}| \le \prod_{\gamma} r_i(A)\Bigr\}.$$


Theorem 9.2.8 Let $A = [a_{ij}]$ be an irreducible matrix of order
$n \ge 2$. Then all the eigenvalues of A are included in the region of
the complex plane specified by the union of $Z_{\gamma}$ taken over all cycles
$\gamma$ of D(A) of length at least 2.


Proof. Let $\lambda$ be an eigenvalue of A and consider the singular
matrix $\lambda I_n - A$. The digraphs D(A) and $D(\lambda I_n - A)$ have the
same cycles of length at least 2, and $r_i(A) = r_i(\lambda I_n - A)$ for each
i = 1, 2, ..., n. Because the matrix $\lambda I_n - A$ is singular, it follows
from Theorem 9.2.7 that there is a cycle $\gamma$ of D(A) of length at
least 2 such that

$$\prod_{\gamma} |\lambda - a_{ii}| \le \prod_{\gamma} r_i(A).$$

Thus $\lambda$ is in $Z_{\gamma}$ and the theorem holds.


Special cases of Theorems 9.2.7 and 9.2.8 are contained in the 
following theorem of Brauer. We leave it as an exercise to provide 
the proof. 


Corollary 9.2.9 Let $A = [a_{ij}]$ be a matrix of order $n \ge 2$. If

$$|a_{ii} a_{jj}| > r_i(A)\, r_j(A) \quad (1 \le i < j \le n),$$

then A is invertible. The eigenvalues of A are all contained in the
region of the complex plane specified by the union of the ovals

$$\{z : |z - a_{ii}|\,|z - a_{jj}| \le r_i(A)\, r_j(A)\} \quad (1 \le i < j \le n).$$


By use of the digraph, we were able to give a substantial
generalization of the Geršgorin inclusion region for the eigenvalues of
a matrix. One can consult [7] and [78] for much more on this topic,
including proofs of results not given here.


9.3 Permanent and SNS-Matrices


Let A = [aij] be a matrix of order n. The definition of the perma- 
nent of A follows the classical definition of the determinant given 
in Theorem 4.4.2 but with a simplification. Ironically, this sim- 
plification in the formula makes it more difficult to compute the 
permanent. 

The permanent of A is the number given by the formula

$$\operatorname{per} A = \sum_{(j_1, j_2, \ldots, j_n)} a_{1j_1} a_{2j_2} \cdots a_{nj_n}, \qquad (9.6)$$

where the summation is over all permutations $(j_1, j_2, \ldots, j_n)$ of
{1, 2, ..., n}. Thus, unlike the determinant, we do not put a minus
sign in front of some of the terms in the summation in (9.6). In
the permanent we compute all possible products of n entries of
A provided these n entries come from different rows and different
columns. As a result, the permanent does not change if we permute
the rows of A and permute the columns of A. In addition,
the permanent does not change when a matrix is transposed. An
equivalent way to define the permanent uses the weighted Coates
digraph D*(A). Recall that, according to Definition 4.1, the
determinant of A is given by

$$\det A = (-1)^n \sum_{L \in \mathcal{L}(A)} (-1)^{c(L)} w(L),$$

where the summation is over all linear subdigraphs L of D*(A),
c(L) is the number of cycles of L, and
the weight w(L) of L equals the product of the weights of its edges.
The corresponding formula for the permanent is the simpler

$$\operatorname{per} A = \sum_{L \in \mathcal{L}(A)} w(L).$$


We record the basic, easily verifiable, properties of the perma- 
nent in the next lemma and leave their verification to the reader. 


Lemma 9.3.1 The following properties hold for a matrix A of
order n:

(i) per PAQ = per A for all permutation matrices P and Q of
order n.

(ii) per $A^T$ = per A.

(iii) per cA = $c^n$ per A for all scalars c.

(iv) If A has a row (or column) of all zeros, then per A = 0.

(v) If some row (or some column) of A is multiplied by a scalar
c, then the permanent of the resulting matrix equals c per A.

(vi) per P = 1 for every permutation matrix P of order n. In
particular, per $I_n$ = 1.

(vii) If $A = B \oplus C$, where B and C are square matrices, then
per A = per B per C. More generally, if

$$A = \begin{bmatrix} B & O \\ X & C \end{bmatrix},$$

where B and C are square matrices, then per A =
per B per C.

(viii) (Laplace expansion by a row or column)

$$\operatorname{per} A = \sum_{j=1}^{n} a_{ij} \operatorname{per} A_{ij} \quad (i = 1, 2, \ldots, n)$$

and

$$\operatorname{per} A = \sum_{i=1}^{n} a_{ij} \operatorname{per} A_{ij} \quad (j = 1, 2, \ldots, n).$$

(Recall that $A_{ij}$ is the matrix of order n − 1 obtained from
A by striking out row i and column j.)


Example 9.3.2 Let 
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Then per A = ad + bc. Let 
abo 
B=|0 c d 
e 0 f 


Then it is easy to see that in the permanent of B there are at most 
two nonzero terms, namely, acf and bde. Hence per B = acf +bdf. 


Now let 
T22 | 


gal an 
EEL) 


Then each of the 3! = 6 terms in the permanent of C is nonzero, 
and adding them we obtain that 


perC =14+24124+24+642=25. 


Let D be the matrix obtained from C by adding the second row 
to the first row. Then 


4 3 3 
DS 3 1 1 
1 2 1 


A simple calculation shows that per D = 45. This example shows 
that the elementary row operation of adding a multiple of one row 
to another row can change the value of the permanent. In the 
case of the determinant, we have det D = det C. The fact that 
such elementary row (or column) operations can alter the perma- 
nent leads to the general computational difficulty in evaluating the 
permanent. 
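The computation in Example 9.3.2 can be checked directly from the definitions. The following is a minimal sketch, not an efficient method: the helper names `permanent` and `determinant` are ours, and both functions simply sum over all $n!$ permutations.

```python
from itertools import permutations
from math import prod


def permanent(a):
    """Permanent by the defining sum over all permutations (about n * n! work)."""
    n = len(a)
    return sum(prod(a[i][s[i]] for i in range(n)) for s in permutations(range(n)))


def determinant(a):
    """Determinant by the classical expansion, each term signed by its permutation."""
    n = len(a)
    total = 0
    for s in permutations(range(n)):
        inversions = sum(s[i] > s[j] for i in range(n) for j in range(i + 1, n))
        total += (-1) ** inversions * prod(a[i][s[i]] for i in range(n))
    return total


C = [[1, 2, 2], [3, 1, 1], [1, 2, 1]]
# D is C with the second row added to the first row.
D = [[C[0][j] + C[1][j] for j in range(3)], C[1], C[2]]

print(permanent(C), permanent(D))      # 25 45
print(determinant(C), determinant(D))  # 5 5
```

The row operation leaves the determinant unchanged but changes the permanent, which is why elimination cannot be used to evaluate permanents.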


Let $A = [a_{ij}]$ be a square matrix of order $n$. Recall that the König digraph $G(A)$ of $A$ has $n$ black vertices corresponding to the rows of $A$ and $n$ white vertices corresponding to the columns of $A$, and an edge from black vertex $i$ to white vertex $j$ of weight $a_{ij}$. Recall also (see Section 4.4) that a collection $F$ of $n$ edges, one leaving each black vertex and simultaneously one entering each white vertex, is a 1-factor (or perfect matching) of $G(A)$ and its weight $w(F)$ is the product of the weights of these $n$ edges. If $\mathcal{F}(A)$ denotes the collection of all 1-factors of $G(A)$, then it follows that
\[
\mathrm{per}\,A = \sum_{F \in \mathcal{F}(A)} w(F),
\]
the sum of the weights of all the 1-factors of $G(A)$.

As usual, when $a_{ij} = 0$, we can consider that there is no edge from black vertex $i$ to white vertex $j$; that is, a weight equal to zero is interpreted as the absence of an edge, and thus edges of weight zero are not part of any 1-factor. Now consider the special case where each entry of $A$ equals 0 or 1, that is, $A$ is a $(0,1)$-matrix. Then $w(F) = 0$ or $1$ and so, with our convention, the permanent of $A$ counts the number of 1-factors of $G(A)$. Thus the permanent is a counting function and, indeed, one of some significance.

Another way to view the permanent of a $(0,1)$-matrix $A$ is as the number of permutation matrices $P$ of order $n$ such that $P \le A$ (entrywise). This is so because each such permutation matrix $P \le A$ corresponds to a 1-factor of weight 1 and vice versa.


Example 9.3.3 Let $A$ be the matrix of order $n$ having 0's everywhere on its main diagonal and 1's everywhere off the main diagonal. Thus $A = J_n - I_n$, where $J_n$ is the matrix of order $n$ of all 1's. The permutation matrices $P$ with $P \le A$ (entrywise) correspond to those permutations $i_1 i_2 \ldots i_n$ of $\{1,2,\ldots,n\}$ such that $i_k \ne k$ for $k = 1,2,\ldots,n$. Such permutations are called derangements (of order $n$) since in such a permutation, no integer is in its natural position (the natural position for integer $k$ is position $k$, and this is precluded under the assumption that $i_k \ne k$). The number of derangements is denoted by $D_n$, and thus we have $\mathrm{per}\,(J_n - I_n) = D_n$. We easily calculate that $D_1 = 0$, $D_2 = 1$, and $D_3 = 2$ (the permutations $2,3,1$ and $3,1,2$ are the derangements of order 3). The inclusion-exclusion formula (see Section 1.3) can be used to count the number of derangements of order $n$ as follows. Let $X_k$ be the set of permutations $i_1 i_2 \ldots i_n$ of $\{1,2,\ldots,n\}$ in which $i_k = k$ $(k = 1,2,\ldots,n)$. Then the derangements are those permutations in the intersection $\overline{X}_1 \cap \overline{X}_2 \cap \cdots \cap \overline{X}_n$ of the complements of the $X_k$'s in the set of all permutations of $\{1,2,\ldots,n\}$. Thus
\[
\begin{aligned}
\mathrm{per}\,(J_n - I_n) &= |\overline{X}_1 \cap \overline{X}_2 \cap \cdots \cap \overline{X}_n| \\
&= \sum_{k=0}^{n} (-1)^k \sum_{K \subseteq \{1,2,\ldots,n\},\ |K| = k} \Bigl|\bigcap_{i \in K} X_i\Bigr| \\
&= \sum_{k=0}^{n} (-1)^k \binom{n}{k}\, |X_1 \cap X_2 \cap \cdots \cap X_k|.
\end{aligned}
\]

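Here we have used the fact that $\bigl|\bigcap_{i \in K} X_i\bigr|$ depends only on the cardinality $k$ of $K$ and thus equals $|X_1 \cap X_2 \cap \cdots \cap X_k|$. Since $|X_1 \cap X_2 \cap \cdots \cap X_k|$ counts the number of permutations of the form $1\,2 \ldots k\, i_{k+1} \ldots i_n$, its value is $(n-k)!$, so that $D_n = n!\sum_{k=0}^{n} (-1)^k/k!$. As an example of this formula, we calculate that
\[
D_4 = 4!\left(1 - \frac{1}{1!} + \frac{1}{2!} - \frac{1}{3!} + \frac{1}{4!}\right) = 9.
\]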
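As a quick check of Example 9.3.3, the sketch below compares the permanent of $J_n - I_n$, computed by counting the permutation matrices below it, with the inclusion-exclusion sum $\sum_k (-1)^k \binom{n}{k}(n-k)!$. The function names are ours and the brute-force count is only practical for small $n$.

```python
from itertools import permutations
from math import comb, factorial


def per_J_minus_I(n):
    """per(J_n - I_n): count the permutations with no fixed point."""
    return sum(all(s[k] != k for k in range(n)) for s in permutations(range(n)))


def derangements(n):
    """Inclusion-exclusion count: sum over k of (-1)^k * C(n, k) * (n - k)!."""
    return sum((-1) ** k * comb(n, k) * factorial(n - k) for k in range(n + 1))


for n in range(1, 8):
    assert per_J_minus_I(n) == derangements(n)

print([derangements(n) for n in range(1, 8)])  # [0, 1, 2, 9, 44, 265, 1854]
```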


The resemblance of the permanent to the determinant natu- 
rally leads one to the question of whether it might be possible to 
use the determinant in order to calculate the permanent. 


Example 9.3.4 Let
\[
A = \begin{bmatrix} a & b \\ d & c \end{bmatrix}
\]
and let $A'$ be the matrix obtained from $A$ by attaching a minus sign to the entry $b$:
\[
A' = \begin{bmatrix} a & -b \\ d & c \end{bmatrix}.
\]
Then $\det A' = ac - (-b)d = ac + bd = \mathrm{per}\,A$. Thus, attaching the minus sign to $b$ converts the determinant into the permanent; the determinant of the resulting matrix $A'$ equals the permanent of the original matrix $A$, no matter what the values of $a$, $b$, $c$, and $d$. In other words, the identity $\det A' = \mathrm{per}\,A$ is an algebraic identity.
Now let
\[
A = \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix}.
\]
Can we attach minus signs to some of the entries of $A$ in order to convert the determinant into the permanent? The matrix of order 3 has the property that, in the classical formula for the determinant, both the even permutations and the odd permutations partition the entries of $A$:

(even permutation terms) $aei$, $bfg$, and $cdh$; (9.7)

(odd permutation terms) $ceg$, $bdi$, and $ahf$. (9.8)

If we are to convert the determinant into the permanent, then the terms in (9.7) must each have an even number of minus signs in them, while the terms in (9.8) must each have an odd number of minus signs in them. Each entry of $A$ lies in exactly one term of (9.7) and in exactly one term of (9.8), so the total number of minus signs can be counted through either list. Since the sum of three odd numbers is odd and the sum of three even numbers is even, this is impossible. Thus the determinant of the general matrix of order 3 cannot be converted into its permanent.

Now suppose we assume that $c$ is identically zero. Thus $A$ now takes the form
\[
A = \begin{bmatrix} a & b & 0 \\ d & e & f \\ g & h & i \end{bmatrix},
\]
and the permanent of $A$ satisfies
\[
\mathrm{per}\,A = aei + bfg + bdi + ahf.
\]
Let
\[
A' = \begin{bmatrix} a & -b & 0 \\ d & e & -f \\ g & h & i \end{bmatrix}.
\]
Then calculating the determinant of $A'$ gives the algebraic identity
\[
\det A' = aei + bfg + bdi + ahf = \mathrm{per}\,A. \tag{9.9}
\]
Thus, affixing minus signs to $b$ and $f$ converts the determinant into the permanent. Because (9.9) is an algebraic identity, we can reformulate our discussion as follows. Let


\[
B = \begin{bmatrix} 1 & 1 & 0 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}.
\]
There are four nonzero terms in the determinant and permanent of $B$. All these terms in the permanent have value 1. In the determinant, two have value 1 (corresponding to the even permutations $1,2,3$ and $2,3,1$) and two have value $-1$ (corresponding to the odd permutations $2,1,3$ and $1,3,2$). To convert the determinant into the permanent we need to change some of the 1's to $-1$'s in order that all the nonzero terms in the determinant now have value 1 (and so no cancellation occurs). This is accomplished with the matrix
\[
B' = \begin{bmatrix} 1 & -1 & 0 \\ 1 & 1 & -1 \\ 1 & 1 & 1 \end{bmatrix},
\]
where
\[
\det B' = 4 = \mathrm{per}\,B.
\]
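The two cases of Example 9.3.4 can also be confirmed by exhaustive search. The sketch below (helper names `signed_terms` and `converts` are ours) lists the signed nonzero terms of the determinant expansion and tries every assignment of signs to the 1's of a $(0,1)$-pattern, looking for one in which every nonzero term equals 1.

```python
from itertools import permutations, product
from math import prod


def signed_terms(a):
    """Nonzero terms of the classical determinant expansion, signs included."""
    n = len(a)
    terms = []
    for s in permutations(range(n)):
        value = prod(a[i][s[i]] for i in range(n))
        if value != 0:
            inversions = sum(s[i] > s[j] for i in range(n) for j in range(i + 1, n))
            terms.append((-1) ** inversions * value)
    return terms


def converts(pattern):
    """True if some choice of signs on the 1's makes every nonzero term equal 1."""
    ones = [(i, j) for i, row in enumerate(pattern) for j, x in enumerate(row) if x]
    for signs in product((1, -1), repeat=len(ones)):
        a = [[0] * len(pattern) for _ in pattern]
        for (i, j), sgn in zip(ones, signs):
            a[i][j] = sgn
        if all(t == 1 for t in signed_terms(a)):
            return True
    return False


B_prime = [[1, -1, 0], [1, 1, -1], [1, 1, 1]]
print(signed_terms(B_prime))                        # [1, 1, 1, 1]
print(converts([[1, 1, 0], [1, 1, 1], [1, 1, 1]]))  # True
print(converts([[1, 1, 1], [1, 1, 1], [1, 1, 1]]))  # False
```

The last line agrees with the parity argument: none of the $2^9$ sign patterns on the full matrix of order 3 works.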


The examples above lead to the following definition. 


Definition 9.3.5 Let $A'$ be a $(0,1,-1)$-matrix, that is, a matrix each of whose entries is 0, 1, or $-1$, with at least one nonzero term in its classical determinant expansion. Let $A$ be the matrix obtained from $A'$ by replacing each of its $-1$'s with 1's. Then $A'$ is a sign-nonsingular matrix (abbreviated SNS-matrix) provided that $\det A' = \pm\,\mathrm{per}\,A$. If $A'$ is an SNS-matrix, then, in evaluating the determinant of $A'$ using the classical expansion, there can be no cancellation of nonzero terms; either all the nonzero terms equal 1 (so $\det A' = \mathrm{per}\,A$) or all the nonzero terms have value $-1$ (so $\det A' = -\mathrm{per}\,A$).

Proceeding in the other direction we get the following. Start with a $(0,1)$-matrix $A = [a_{ij}]$ of order $n$ such that the permanent of $A$ is not zero (the König digraph has a perfect matching). If an SNS-matrix $A'$ can be obtained from $A$ by changing some of the 1's of $A$ to $-1$'s, then $\det A' = \pm\,\mathrm{per}\,A$, and we have succeeded in converting the determinant into the permanent, or, we might better say, we have succeeded in converting the permanent of $A$ into a determinant. We note that in case we get $\det A' = -\mathrm{per}\,A$, then by multiplying the entries of $A'$ in row 1 by $-1$, we obtain an SNS-matrix $A''$ with $\det A'' = \mathrm{per}\,A$. As a result we can ignore, with no loss in generality, the possibility that $\det A' = -\mathrm{per}\,A$.


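Example 9.3.6 The following matrices are SNS-matrices:
\[
\begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix}, \quad
\begin{bmatrix} 1 & -1 & 0 \\ 1 & 1 & -1 \\ 1 & 1 & 1 \end{bmatrix}, \quad
\begin{bmatrix} 1 & -1 & 0 & 0 \\ 1 & 1 & -1 & 0 \\ 1 & 1 & 1 & -1 \\ 1 & 1 & 1 & 1 \end{bmatrix}.
\]
The example of order 4 shows that the permanent of the matrix
\[
\begin{bmatrix} 1 & 1 & 0 & 0 \\ 1 & 1 & 1 & 0 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \end{bmatrix}
\]
can be converted into a determinant; the permanent equals 8. Moreover, because we get no cancellation of nonzero terms in the determinant, we obtain the algebraic identity that for
\[
A = \begin{bmatrix} a & b & 0 & 0 \\ c & d & e & 0 \\ f & g & h & i \\ j & k & l & m \end{bmatrix}
\]
and
\[
A' = \begin{bmatrix} a & -b & 0 & 0 \\ c & d & -e & 0 \\ f & g & h & -i \\ j & k & l & m \end{bmatrix}
\]
we have
\[
\det A' = \mathrm{per}\,A.
\]

We collect some elementary properties of SNS-matrices in the following lemma.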


Lemma 9.3.7 Let $A$ be an SNS-matrix of order $n$. Then the following hold:

(i) $A$ has a nonzero term in its classical determinant expansion.

(ii) $A^T$ is an SNS-matrix.

(iii) If $P$ and $Q$ are permutation matrices of order $n$, then $PAQ$ is an SNS-matrix.

(iv) Every matrix obtained from $A$ by multiplying some rows and columns by $-1$ is an SNS-matrix. Equivalently, if $D_1$ and $D_2$ are diagonal matrices with only 1's and $-1$'s on the main diagonal, then $D_1 A D_2$ is an SNS-matrix.


We now discuss an important connection between SNS-matrices and digraphs. Let $A = [a_{ij}]$ be a $(0,1,-1)$-matrix of order $n$. We consider under what circumstances $A$ is an SNS-matrix. In order for $A$ to have a chance of being SNS, it must have a nonzero term in its determinant expansion (see (i) of Lemma 9.3.7). By (iii) of Lemma 9.3.7, we may assume that this nonzero term is the product of the entries on the main diagonal, that is, all entries on the main diagonal of $A$ are nonzero. By (iv) of Lemma 9.3.7, we may further assume that all entries on the main diagonal equal $-1$. With these assumptions, we now consider the weighted digraph $D(A)$. Each edge of $D(A)$ has weight $\pm 1$ (as we often do, we ignore edges of weight 0).


Theorem 9.3.8 Let $A = [a_{ij}]$ be a $(0,1,-1)$-matrix of order $n$ each of whose entries on the main diagonal equals $-1$. Then $A$ is an SNS-matrix if and only if the weight of each cycle in the Coates digraph $D^*(A)$ equals $-1$.

Proof. We refer to the definition of the determinant given in (4.1):
\[
\det(A) = (-1)^n \sum_{L \in \mathcal{L}(A)} (-1)^{c(L)} w(L), \tag{9.10}
\]
where the summation extends over all linear subdigraphs $L$ of the Coates digraph $D^*(A)$ and $c(L)$ denotes the number of cycles of $L$; we may restrict this summation to those $L$ for which $w(L) \ne 0$. The term corresponding to the linear subdigraph consisting of $n$ loops, one at each vertex, of weight $-1$ (corresponding to the identity permutation $1,2,\ldots,n$) equals
\[
(-1)^n a_{11}a_{22}\cdots a_{nn} = (-1)^n(-1)^n = (-1)^{2n} = 1.
\]
The matrix $A$ is an SNS-matrix if and only if all nonzero terms $(-1)^{c(L)}w(L)$ in the summation (9.10) equal 1.

First suppose that $A$ is an SNS-matrix. The weights of the cycles of length 1, the loops, equal $-1$ because $A$ has all $-1$'s on its main diagonal. Let $\gamma$ be a cycle of length $k \ge 2$, and let $L$ be the linear subdigraph whose cycles are $\gamma$ and the $n-k$ loops at the vertices not contained on $\gamma$. Then $c(L) = 1 + n - k$ and $w(L) = w(\gamma)\cdot(-1)^{n-k}$. Because $A$ is an SNS-matrix, we have
\[
1 = (-1)^{c(L)} w(L) = (-1)^{1+n-k}\, w(\gamma)\, (-1)^{n-k} = (-1)\,w(\gamma).
\]
Thus $w(\gamma) = -1$ for every cycle of length at least 2.

Now assume that the weight of each cycle equals $-1$. Let $L$ be a linear subdigraph of $D^*(A)$ with $k$ cycles (including cycles of length 1). Then $c(L) = k$ and $w(L) = (-1)^k$. Hence
\[
(-1)^{c(L)} w(L) = (-1)^k (-1)^k = 1.
\]
Hence $A$ is an SNS-matrix.


Example 9.3.9 Let
\[
A = \begin{bmatrix} -1 & 1 & 0 & 0 \\ 0 & -1 & 1 & 1 \\ -1 & -1 & -1 & 1 \\ -1 & 0 & 0 & -1 \end{bmatrix}.
\]
Each entry on the main diagonal of the matrix $A$ equals $-1$, and hence Theorem 9.3.8 applies. The digraph $D(A)$ is pictured in Figure 9.1. One easily checks that each cycle of the digraph $D^*(A)$ has weight $-1$. Thus $A$ is an SNS-matrix and its determinant equals the permanent of the matrix
\[
\begin{bmatrix} 1 & 1 & 0 & 0 \\ 0 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 0 & 0 & 1 \end{bmatrix}.
\]
The common value is 5.


Figure 9.1 
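The cycle criterion of Theorem 9.3.8 is easy to test mechanically. The sketch below is an illustration under our own conventions: `cycle_weights` treats $i \to j$ as an edge whenever $a_{ij} \ne 0$ (reversing all edges gives cycles of the same weights, so the orientation convention does not matter here), and `is_sns` is a brute-force comparison that checks for cancellation in the determinant expansion.

```python
from itertools import permutations
from math import prod


def cycle_weights(a):
    """Weights of all directed cycles, where i -> j is an edge when a[i][j] != 0."""
    n = len(a)
    weights = []

    def extend(start, v, visited, w):
        # Only vertices >= start are used, so each cycle is found exactly once,
        # from its smallest vertex.
        for u in range(start, n):
            if a[v][u] == 0:
                continue
            if u == start:
                weights.append(w * a[v][u])
            elif u not in visited:
                extend(start, u, visited | {u}, w * a[v][u])

    for s in range(n):
        extend(s, s, {s}, 1)
    return weights


def is_sns(a):
    """Direct test: all nonzero terms of the determinant expansion share one sign."""
    n = len(a)
    signs = set()
    for s in permutations(range(n)):
        value = prod(a[i][s[i]] for i in range(n))
        if value:
            inversions = sum(s[i] > s[j] for i in range(n) for j in range(i + 1, n))
            signs.add((-1) ** inversions * value)
    return len(signs) == 1


A = [[-1, 1, 0, 0], [0, -1, 1, 1], [-1, -1, -1, 1], [-1, 0, 0, -1]]
print(cycle_weights(A))  # four loops and four longer cycles, every weight -1
print(is_sns(A))         # True
```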


It is possible for a $(0,1)$-matrix (or a $(0,-1)$-matrix) to be an SNS-matrix. Of course $I_n$ and $-I_n$ are SNS-matrices, as are $P$ and $-P$ for every permutation matrix $P$. A nontrivial example is the circulant
\[
\begin{bmatrix}
1 & 1 & 0 & 1 & 0 & 0 & 0 \\
0 & 1 & 1 & 0 & 1 & 0 & 0 \\
0 & 0 & 1 & 1 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 & 1 & 0 & 1 \\
1 & 0 & 0 & 0 & 1 & 1 & 0 \\
0 & 1 & 0 & 0 & 0 & 1 & 1 \\
1 & 0 & 1 & 0 & 0 & 0 & 1
\end{bmatrix}
\]
whose permanent and determinant both equal 24. Replacing the 1's in this matrix by any numbers whatsoever results in a matrix whose determinant equals its permanent.

The permanent is an important and fundamental combinatorial function for which we have given a graphical interpretation. It motivated for us the important notion of an SNS-matrix. More about SNS-matrices and related topics can be found in [10].
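The claim about the circulant of order 7 can be verified by brute force over the $7! = 5040$ permutations. This is a small sketch with names of our own choosing; the random integer entries are merely one arbitrary test of the statement that any numbers may replace the 1's.

```python
from itertools import permutations
from math import prod
import random


def det_and_per(a):
    """Return (det, per) from the classical expansions over all permutations."""
    n = len(a)
    det = per = 0
    for s in permutations(range(n)):
        term = prod(a[i][s[i]] for i in range(n))
        if term:
            inversions = sum(s[i] > s[j] for i in range(n) for j in range(i + 1, n))
            det += (-1) ** inversions * term
            per += term
    return det, per


n = 7
# Circulant with 1's in positions (i, i), (i, i+1), (i, i+3), column indices mod 7.
support = [[(j - i) % n in (0, 1, 3) for j in range(n)] for i in range(n)]
zero_one = [[int(x) for x in row] for row in support]
print(det_and_per(zero_one))  # (24, 24)

random.seed(0)  # arbitrary integer entries placed on the same support
filled = [[random.randint(-9, 9) if x else 0 for x in row] for row in support]
d, p = det_and_per(filled)
print(d == p)  # True
```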


9.4 Exercises 


1. Let $A$ be a matrix of order $m$, and let $B$ be a matrix of order $n$. Let the eigenvalues of $A$ be $\lambda_1, \lambda_2, \ldots, \lambda_m$, and let the eigenvalues of $B$ be $\mu_1, \mu_2, \ldots, \mu_n$. Prove that the eigenvalues of $aA \otimes I_n + bI_m \otimes B$ are $a\lambda_i + b\mu_j$ $(1 \le i \le m,\ 1 \le j \le n)$.


2. Let $A$ and $B$ be as in Exercise 1. Prove that $A$ and $B$ do not have a common eigenvalue (a number that is an eigenvalue of both $A$ and $B$) if and only if
\[
\det(A \otimes I_n - I_m \otimes B) \ne 0.
\]


3. Determine the Geršgorin region for each of the following matrices:

1 1 
@a=|] e 

1 0 1 
(b) A=|1 -3 1 

2 0 6 

1+2 2 —1 
(c) A= 3 2 1 

2 1-i 2-i 


4. Let $A = [a_{ij}]$ be a matrix of order $n$, and let $D$ be a diagonal matrix of order $n$ with positive diagonal entries $d_1, d_2, \ldots, d_n$. Apply Geršgorin's theorem to $D^{-1}AD$ to obtain an inclusion region for the eigenvalues of $A$.


5. Prove Theorem 9.2.5. 


6. Prove Corollary 9.2.9.


7. Verify the properties of the permanent in Lemma 9.3.1.


8. Find a formula for the permanent of $A = (a-b)I_n + bJ_n$ in terms of the derangement numbers. (Here, as before, $J_n$ denotes the matrix of order $n$ each of whose entries equals 1.)


9. Compute the permanent of the Hessenberg matrix $H_n = [h_{ij}]$ of order $n$ defined by
\[
h_{ij} = \begin{cases} 1 & \text{if } j \le i+1, \\ 0 & \text{otherwise.} \end{cases}
\]
Thus $H_4$ is the matrix
\[
\begin{bmatrix} 1 & 1 & 0 & 0 \\ 1 & 1 & 1 & 0 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \end{bmatrix}.
\]


10. Use Theorem 9.3.8 to determine whether or not the matrix
\[
\begin{bmatrix}
1 & 0 & 1 & 1 & 0 & 0 \\
1 & 1 & 1 & 0 & 0 & 0 \\
0 & 1 & 1 & 0 & 0 & 1 \\
0 & 1 & 0 & 1 & 0 & 0 \\
1 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 1 & 1
\end{bmatrix}
\]
is an SNS-matrix.


11. For each of the following matrices, show how to affix minus signs to some of the 1's so that the matrix becomes an SNS-matrix:
\[
\begin{bmatrix} 0 & 1 & 1 & 0 \\ 1 & 0 & 1 & 1 \\ 1 & 1 & 0 & 1 \\ 1 & 1 & 1 & 0 \end{bmatrix}, \quad
\begin{bmatrix} 1 & 1 & 0 & 1 \\ 1 & 1 & 1 & 0 \\ 0 & 1 & 1 & 1 \\ 1 & 0 & 1 & 1 \end{bmatrix}.
\]


Chapter 10 


Applications 


This chapter is intended for those for whom mathematics is pri- 
marily a tool to describe and understand phenomena in other sci- 
entific disciplines. Because the applications of matrices and graphs 
in science are so numerous, we can present only a few selected ex- 
amples. We shall focus on several topics where it is possible or 
even necessary to use the combinatorial approach developed in 
this book. 


The three sections of this chapter describe some applications 
in electrical engineering, physics, and chemistry. These sections 
should not be considered as introductions or as surveys of these 
fields. In each section we assume a certain familiarity with the problems considered, with only very short explanations of background and specific terminology¹ given. However, we shall always give references to relevant general books where the interested reader can find more information. We also assume that the reader is acquainted with the fundamentals of basic mathematical analysis, in particular with ordinary and partial differential equations.


¹Sometimes the terminology of a specific field conflicts with the terminology used in other chapters of this book.


10.1 Electrical Engineering: Flow Graphs


Applications of matrix theory in electrical engineering are numerous and varied. The matrices that appear in this area are usually sparse, i.e., they contain many zero entries. This fact accounts for the great popularity of graph-theoretical, i.e., combinatorial, methods in matrix theory among electrical engineers.

Electrical engineers have developed a series of methods for solv- 
ing systems of linear algebraic equations, which appear in the the- 
ory of electrical circuits, control theory, and other areas. These 
methods use flow graphs (Coates [13], [26], [15]), signal flow graphs 
(Mason [59], [60], [80]) and Chan graphs (Chan and Mai [16], [14]); 
the first two graphs are described in Chapter 6. For more recent 
treatments, as well as for backgrounds, see, for example, [57] and 
[28]. 

It is noteworthy that the mentioned graphs (and especially, 
Mason’s signal flow graphs) give a better insight into the physical 
system being described than the corresponding system of equa- 
tions does. The signal flow graph technique is very effective and 
therefore popular among engineers. It was, in fact, first developed 
during the Second World War as an aid in designing weapon con- 
trol systems by Shannon [73] but remained unknown to the public 
for many years. 

One of the basic problems in the theory of electrical circuits² is to determine the currents in all branches of a given electrical circuit when the voltages of the electrical generators are given. For this purpose one uses the Kirchhoff Voltage Law (KVL), which says that the algebraic sum of voltage drops around any loop³ is equal to zero. Applying KVL to several loops we get a system of linear algebraic equations with loop currents as unknowns. Graph theory helps to find a maximal set of independent loops, ensuring that the resulting system consists of independent equations sufficient to determine all currents.


²An electrical circuit is an interconnection of two-terminal components called branches. Branches are connected by their terminal points, called nodes. Electrical circuits have graphs as a natural mathematical model. Nodes become vertices and branches become edges of the associated graph. Depending on the problem, the associated graph can be undirected or directed. In the latter case, an orientation is (in an arbitrary way) associated with each branch and (the same orientation) with the corresponding edge.


Example 10.1.1 We determine the current in the branch with resistance $R_5$ in the circuit of Figure 10.1.


Figure 10.1


We number the five loops in Figure 10.1 from 1 to 5 from left to right and orient them in a counterclockwise fashion. The equations for the loop currents are
\[
\begin{aligned}
R_{11}I_1 - R_{12}I_2 &= -E_1, \\
-R_{21}I_1 + R_{22}I_2 - R_{23}I_3 &= 0, \\
-R_{32}I_2 + R_{33}I_3 - R_{34}I_4 &= 0, \\
-R_{43}I_3 + R_{44}I_4 - R_{45}I_5 &= 0, \\
-R_{54}I_4 + R_{55}I_5 &= E_2,
\end{aligned}
\]
where
\[
R_{11} = R_1 + R_2, \quad R_{22} = R_2 + R_3 + R_4, \quad R_{33} = R_4 + R_5 + R_6,
\]
\[
R_{44} = R_6 + R_7 + R_8, \quad R_{55} = R_8 + R_9,
\]
\[
R_{12} = R_{21} = R_2, \quad R_{23} = R_{32} = R_4, \quad R_{34} = R_{43} = R_6, \quad R_{45} = R_{54} = R_8.
\]


³Here a loop means a subgraph that is reduced to a cycle if we neglect orientations of edges.


Instead of explicitly writing this system of equations, we can immediately construct the corresponding Coates digraph (routine, after some practice). In our case the Coates digraph is given in Figure 10.2.


Figure 10.2


The current $I_3$ through $R_5$ corresponds to vertex 3 and can be immediately obtained by using the Coates formula (6.15):
\[
I_3 = \frac{N}{D},
\]
where
\[
N = E_1R_{21}R_{32}R_{44}R_{55} - E_1R_{21}R_{32}R_{45}R_{54} + E_2R_{45}R_{34}R_{12}R_{21} - E_2R_{45}R_{34}R_{11}R_{22}
\]
and
\[
\begin{aligned}
D = {} & R_{12}R_{21}R_{33}R_{44}R_{55} - R_{12}R_{21}R_{34}R_{43}R_{55} - R_{12}R_{21}R_{33}R_{45}R_{54} \\
& + R_{11}R_{23}R_{32}R_{44}R_{55} - R_{11}R_{23}R_{32}R_{45}R_{54} + R_{11}R_{22}R_{34}R_{43}R_{55} \\
& + R_{11}R_{22}R_{33}R_{45}R_{54} - R_{11}R_{22}R_{33}R_{44}R_{55}.
\end{aligned}
\]
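The closed form for the current in Example 10.1.1 can be compared with a direct solution of the loop equations. The sketch below assumes the loop equations in the tridiagonal form $R_{11}I_1 - R_{12}I_2 = -E_1$, ..., $-R_{54}I_4 + R_{55}I_5 = E_2$, uses hypothetical sample values for the resistances and generator voltages, and solves for the third loop current by Cramer's rule in exact rational arithmetic.

```python
from fractions import Fraction
from itertools import permutations
from math import prod


def det(a):
    """Determinant by the classical expansion (fine for order 5)."""
    n = len(a)
    total = 0
    for s in permutations(range(n)):
        inversions = sum(s[i] > s[j] for i in range(n) for j in range(i + 1, n))
        total += (-1) ** inversions * prod(a[i][s[i]] for i in range(n))
    return total


# Hypothetical sample values: R[k] = k ohms, E1 = 10 V, E2 = 4 V.
R = {k: Fraction(k) for k in range(1, 10)}
E1, E2 = Fraction(10), Fraction(4)

R11, R22, R33 = R[1] + R[2], R[2] + R[3] + R[4], R[4] + R[5] + R[6]
R44, R55 = R[6] + R[7] + R[8], R[8] + R[9]
R12 = R21 = R[2]
R23 = R32 = R[4]
R34 = R43 = R[6]
R45 = R54 = R[8]

M = [[R11, -R12, 0, 0, 0],
     [-R21, R22, -R23, 0, 0],
     [0, -R32, R33, -R34, 0],
     [0, 0, -R43, R44, -R45],
     [0, 0, 0, -R54, R55]]
b = [-E1, 0, 0, 0, E2]

# Cramer's rule: replace the third column of M by the right-hand side.
M3 = [row[:2] + [b[i]] + row[3:] for i, row in enumerate(M)]
I3_cramer = det(M3) / det(M)

N = (E1 * R21 * R32 * R44 * R55 - E1 * R21 * R32 * R45 * R54
     + E2 * R45 * R34 * R12 * R21 - E2 * R45 * R34 * R11 * R22)
D = (R12 * R21 * R33 * R44 * R55 - R12 * R21 * R34 * R43 * R55
     - R12 * R21 * R33 * R45 * R54 + R11 * R23 * R32 * R44 * R55
     - R11 * R23 * R32 * R45 * R54 + R11 * R22 * R34 * R43 * R55
     + R11 * R22 * R33 * R45 * R54 - R11 * R22 * R33 * R44 * R55)

print(I3_cramer == N / D)  # True
```

With these conventions both $N$ and $D$ are the negatives of the corresponding Cramer determinants, so their quotient is unchanged.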


In control theory, where systems and signals are the main objects, we usually encounter mathematical models that reduce to systems of ordinary linear differential equations with constant coefficients. If we apply the Laplace transformation to such a system, we get a system of linear algebraic equations that is to be solved. Instead of time dependent functions, which represent voltages, currents, etc. (signals in general), the equations now contain the Laplace transforms of these functions. Matrices of these systems are typically sparse, and again we are in a position to apply techniques of combinatorial matrix theory, in particular flow graphs and signal flow graphs. As their name indicates, signal flow graphs are specially designed to visualize and enable an easy analysis of the signal flow through complex systems of control theory. More information can be found, for example, in [5] and [28].


Example 10.1.2 For the two-terminal network of Figure 10.3, determine the transfer function $G(s)$ defined as the ratio of the Laplace transforms of the output signal $U_4(s)$ and the input signal $U_1(s)$.


Figure 10.3

We have the following equations:
\[
\begin{aligned}
I_1(s) &= \frac{1}{R}\bigl(U_1(s) - U_2(s)\bigr), & U_2(s) &= \frac{1}{Cs}\bigl(I_1(s) - I_2(s)\bigr), \\
I_2(s) &= \frac{1}{R}\bigl(U_2(s) - U_3(s)\bigr), & U_3(s) &= \frac{1}{Cs}\bigl(I_2(s) - I_3(s)\bigr), \\
I_3(s) &= \frac{1}{R}\bigl(U_3(s) - U_4(s)\bigr), & U_4(s) &= \frac{1}{Cs}\,I_3(s).
\end{aligned}
\]

The corresponding signal flow graph is given in Figure 10.4.


Figure 10.5


There exists just one path from vertex $U_1(s)$ to vertex $U_4(s)$. Its weight is $p_1 = \dfrac{1}{R^3C^3s^3}$. All cycles touch this path, and for the corresponding determinant we get $\Delta_1 = 1$.

There are five cycles, denoted by 1, 2, 3, 4, 5, in Figure 10.4, each of which has weight $-\dfrac{1}{RCs}$. There are exactly six pairs of cycles that do not touch each other, namely, cycles 1 and 3, 1 and 4, 1 and 5, 2 and 4, 2 and 5, and 3 and 5. There is only one triple, 1, 3, and 5, of mutually nontouching cycles. Four or more cycles that mutually do not touch each other do not exist. Using Mason's formula we get
\[
G(s) = \frac{\dfrac{1}{R^3C^3s^3}}{1 + \dfrac{5}{RCs} + \dfrac{6}{R^2C^2s^2} + \dfrac{1}{R^3C^3s^3}} = \frac{1}{\tau^3s^3 + 5\tau^2s^2 + 6\tau s + 1},
\]
where $\tau = RC$.
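The bookkeeping in Mason's formula for Example 10.1.2 can be mechanized. The sketch below is illustrative only: the function `mason_delta` and the sample value of $RCs$ are ours, and the set `touching` encodes that consecutive cycles touch, as implied by the six nontouching pairs listed in the example.

```python
from fractions import Fraction
from itertools import combinations


def mason_delta(loop_weights, touching):
    """Graph determinant 1 - sum L_i + sum L_i L_j - ..., over nontouching sets."""
    m = len(loop_weights)
    delta = Fraction(1)
    for size in range(1, m + 1):
        for subset in combinations(range(m), size):
            if all((i, j) not in touching for i, j in combinations(subset, 2)):
                term = Fraction(1)
                for i in subset:
                    term *= loop_weights[i]
                delta += (-1) ** size * term
    return delta


x = Fraction(3, 7)                         # hypothetical sample value of RCs
loops = [-1 / x] * 5                       # five cycles, each of weight -1/(RCs)
touching = {(i, i + 1) for i in range(4)}  # consecutive cycles share a vertex
path = 1 / x ** 3                          # the single path, weight 1/(RCs)^3

G = path / mason_delta(loops, touching)    # Delta_1 = 1 for the only path
print(G == 1 / (x ** 3 + 5 * x ** 2 + 6 * x + 1))  # True
```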


Example 10.1.3 For the system in Figure 10.5 we immediately get the corresponding signal flow graph in Figure 10.6.


Figure 10.6


There are two paths from vertex $R(s)$ to vertex $C(s)$. They have weights
\[
P_1 = G_1G_2G_3G_4, \qquad P_2 = G_1G_5G_4.
\]
The determinants of the corresponding subdigraphs are $\Delta_1 = \Delta_2 = 1$. The three cycles in the digraph do touch each other. The transfer function reads
\[
\frac{C(s)}{R(s)} = \frac{G_1G_2G_3G_4 + G_1G_4G_5}{1 + G_3G_4H_1 + G_1G_2G_3G_4H_2 + G_1G_4G_5H_2}.
\]

The last example demonstrates some useful features of the signal flow graph technique: the variables in the equations of the system represented by a signal flow graph represent real signals in a technical system! The equation corresponding to the $i$th vertex of a signal flow graph (see (6.16)) says that the signal at the vertex $i$ is equal to the sum of signals from other vertices via incoming edges multiplied by the weights⁴ of the edges and thus corresponds to the physical reality. In this way, a signal flow graph is, in fact, a simplified block diagram of a technical system (compare the block diagram of Figure 10.5 with the corresponding signal flow graph in Figure 10.6).


10.2 Physics: Vibration of 
a Membrane 


Of the many applications of matrix theory to physics, we consider only one in which the graph-theoretic approach to the matrix theory involved is dominant: the problem of the vibration of a membrane.

In the approximate numerical solution of certain partial differential equations, graphs and their spectra arise quite naturally. Consider, for example, the partial differential equation
\[
\frac{\partial^2 z}{\partial x^2} + \frac{\partial^2 z}{\partial y^2} + \lambda z = 0 \tag{10.1}
\]
(or $\Delta z + \lambda z = 0$, where $\Delta = \dfrac{\partial^2}{\partial x^2} + \dfrac{\partial^2}{\partial y^2}$ is the Laplace operator). Here the unknown function $z = z(x,y)$ is subject to the boundary condition $z(x,y) = 0$ on a simple closed curve $\Gamma$ lying in the $xy$-plane. It is known that equation (10.1) has a solution only for an infinite sequence $\lambda_1 \le \lambda_2 \le \cdots \le \lambda_n \le \cdots$ of (discrete) values of $\lambda$, which are called the eigenvalues of the equation. The sequence of eigenvalues is called the spectrum of the equation, and the solutions of (10.1) are the corresponding eigenfunctions.


⁴In applications the weights of edges are called transmittances.


In an approximate determination of $z$ we consider the values only for a set of points $(x_i, y_i)$ that form a regular lattice (square, triangular, or hexagonal) in the $xy$-plane. A corresponding (infinite) graph can be associated, in a straightforward and natural way, with this lattice. Points $(x_i, y_i)$ are the vertices of the graph and the edges connect pairs of points of minimal distance. The points (respectively, vertices) lying in the interior of $\Gamma$ are called internal points (respectively, internal vertices), and the other points (respectively, vertices) of the lattice are called external. Let $z_i = z(x_i, y_i)$. Because of the boundary condition, we can take $z_i = 0$ for all external points.


Figure 10.7 


In the case of a square lattice (Figure 10.7), let $z_0 = z(x_0, y_0)$, $(x_0, y_0)$ being a fixed point of the lattice, and let $z_1 = z(x_0 + h, y_0)$, $z_2 = z(x_0 - h, y_0)$, $z_3 = z(x_0, y_0 + h)$, and $z_4 = z(x_0, y_0 - h)$ be the values of $z$ for the neighboring points (we assume that the points of the lattice lie on lines that are parallel with the coordinate axes and that the distance between any two neighboring points is $h$).

The value of $\dfrac{\partial^2 z}{\partial x^2} + \dfrac{\partial^2 z}{\partial y^2}$ at the point $(x_0, y_0)$ can, as usual, be approximated by
\[
\frac{1}{h^2}\,(z_1 + z_2 + z_3 + z_4 - 4z_0).
\]
Equation (10.1) then becomes
\[
\frac{1}{h^2}\,(z_1 + z_2 + z_3 + z_4 - 4z_0) + \lambda z_0 = 0,
\]
or
\[
(4 - \lambda h^2)\,z_0 = z_1 + z_2 + z_3 + z_4. \tag{10.2}
\]


Now let the internal points be labeled by $1, 2, \ldots, n$. Taking $\nu = 4 - \lambda h^2$ and writing the equations corresponding to (10.2) for all internal points $(x_i, y_i)$, $i = 1, 2, \ldots, n$, of the lattice, we obtain
\[
\nu z_i = \sum_{j_i} z_{j_i} \quad (i = 1, 2, \ldots, n), \tag{10.3}
\]
where the summation is taken over all indices $j_i$ corresponding to internal points $(x_{j_i}, y_{j_i})$ neighboring $(x_i, y_i)$.

It is not necessary to include in the sum (10.3) the external points neighboring $(x_i, y_i)$, since the value of $z$ at such points is zero. Let $G$ be the subgraph of the lattice graph induced by the internal vertices. If we interpret $\nu$ as an eigenvalue of $G$ and $(z_1, z_2, \ldots, z_n)^T$ as the corresponding eigenvector, we see that (10.3) just defines the eigenvalue problem for $G$. The graph $G$ will be called the membrane graph.

If $\nu_i$ are the eigenvalues of $G$, the approximate eigenvalues of equation (10.1) are given by $\lambda_i^* = \dfrac{4 - \nu_i}{h^2}$. The corresponding eigenvectors of $G$ represent an approximate solution of (10.1). Note that the $\lambda_i^*$ $(i = 1, 2, \ldots, n)$ do not necessarily represent approximate values for the first $n$ eigenvalues $\lambda_1, \ldots, \lambda_n$ of (10.1), but for some eigenvalues $\lambda_{i_1}, \ldots, \lambda_{i_n}$.

The adjacency matrix $A$ of the graph $G$ is a sparse matrix. For arbitrarily large $n$, the number of nonzero entries in any row or in any column is not greater than the vertex degrees in the lattice (i.e., 4, 6, and 3 for the square, triangular, and hexagonal lattices, respectively). Therefore, such matrices can be treated, at least in principle, by those methods described in Section 6.5. The digraphs $D(A)$ and $D^*(A)$ are identical because $A$ is a symmetric matrix, and a modification of them is just the membrane graph.
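To see the approximation at work, consider the special case, assumed here for illustration only, in which $\Gamma$ is a square of side 1 and the internal points form an $m \times m$ block of the square lattice. For a square membrane the exact eigenvalues of (10.1) are known to be $\pi^2(p^2 + q^2)$, and products of sines give eigenvectors of the membrane graph. The sketch below checks the eigenvalue equation (10.3) entry by entry and compares $(4 - \nu)/h^2$ with the exact values; all names and the choice $m = 20$ are ours.

```python
from math import cos, sin, pi, isclose

m = 20             # internal points per side (hypothetical choice)
side = 1.0         # side length of the square bounded by Gamma
h = side / (m + 1) # lattice spacing


def z(p, q, a, b):
    """Candidate eigenvector entry at the lattice point (a, b)."""
    return sin(p * pi * a / (m + 1)) * sin(q * pi * b / (m + 1))


def check_mode(p, q):
    """Verify nu * z_i = sum of z over lattice neighbours at every internal point."""
    nu = 2 * cos(p * pi / (m + 1)) + 2 * cos(q * pi / (m + 1))
    for a in range(1, m + 1):
        for b in range(1, m + 1):
            # z is zero (up to rounding) at external points a, b in {0, m + 1},
            # so summing over all four lattice neighbours is harmless.
            neighbours = (z(p, q, a + 1, b) + z(p, q, a - 1, b)
                          + z(p, q, a, b + 1) + z(p, q, a, b - 1))
            assert isclose(nu * z(p, q, a, b), neighbours, abs_tol=1e-12)
    return nu


for p, q in [(1, 1), (1, 2), (2, 2)]:
    nu = check_mode(p, q)
    approx = (4 - nu) / h ** 2
    exact = pi ** 2 * (p ** 2 + q ** 2) / side ** 2
    print(p, q, round(approx, 3), round(exact, 3))
```

The agreement is best for the lowest modes and improves as $h$ decreases, consistent with the remark that the $\lambda_i^*$ need not approximate the first $n$ eigenvalues.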

For the triangular and hexagonal lattices (see Figure 10.7), we have, respectively, the following approximate expressions for $\Delta z$ at the point $(x_0, y_0)$:
\[
\frac{2}{3h^2}\,(z_1 + z_2 + z_3 + z_4 + z_5 + z_6 - 6z_0),
\]
\[
\frac{4}{3h^2}\,(z_1 + z_2 + z_3 - 3z_0).
\]
We again obtain (10.3), but now the connection between the eigenvalues of $G$ and of (10.1) is given by $\lambda_i^* = \dfrac{2(6 - \nu_i)}{3h^2}$ and $\lambda_i^* = \dfrac{4(3 - \nu_i)}{3h^2}$, respectively.

The procedure described for approximately solving a partial differential equation is often used in technical problems (see, for example, [18]). In this way the theory of graph spectra can be very useful in practical calculations.

The most interesting problem that can be treated by such a procedure is that of membrane vibration. There are some other problems of the same kind, for example, air oscillations in space, etc. (see [19], [18], [51], and [69]). These problems motivated the authors of [19] to consider graph spectra.

If a vibrating membrane $\Omega$ is held fixed along its boundary $\Gamma$, its displacement $F(x, y, t)$ in the direction orthogonal to its plane is a function of the coordinates $x$, $y$ and time $t$ and satisfies the wave equation
\[
\frac{\partial^2 F}{\partial t^2} = c^2\left(\frac{\partial^2 F}{\partial x^2} + \frac{\partial^2 F}{\partial y^2}\right), \tag{10.4}
\]
where $c$ is a constant depending on the physical properties of the membrane and on the tension under which the membrane is held.

The solutions of the form $F(x, y, t) = z(x, y)e^{i\omega t}$ are of particular interest. If we substitute this expression in (10.4), we obtain
\[
-\omega^2 z(x, y) = c^2\left(\frac{\partial^2 z(x, y)}{\partial x^2} + \frac{\partial^2 z(x, y)}{\partial y^2}\right). \tag{10.5}
\]
Setting $\lambda = \dfrac{\omega^2}{c^2}$ reduces (10.5) to (10.1).

The representation of a membrane by a graph is by no means a mathematical abstraction. Equation (10.1) describes the vibration of a membrane. The membrane is represented by a continuous model. If the membrane is described by a discrete model as given below, we arrive at the system (10.3) obtained in the approximate solution.

According to the discrete model, the membrane consists of a set of atoms that, in the equilibrium state, lie on the vertices of a regular lattice graph embedded in a plane. Each atom acts on its neighboring atoms by elastic forces. We assume that all atoms have the same mass and that elastic forces are of the same intensity for all neighboring pairs of atoms. If $z_i(t)$ and $z_j(t)$ are the displacements of neighboring atoms $i$ and $j$ at time $t$, the elastic force tending to reduce the relative displacement between these atoms is
\[
F_{ij} = -K\bigl(z_i(t) - z_j(t)\bigr),
\]
where $K$ is a constant characteristic of the elastic properties of the membrane.

The equation of motion of the $k$th atom is
\[
\frac{d^2 z_k(t)}{dt^2} = -\frac{K}{m}\sum_{j_k}\bigl(z_k(t) - z_{j_k}(t)\bigr), \tag{10.6}
\]
where $m$ is the mass of an atom and where the summation is taken over the nearest neighbors $j_k$ of the $k$th atom. For a vertex $j$ of the lattice graph in which there is no atom of the membrane, we have $z_j(t) = 0$ (as before, such vertices are called external).

We can again consider pure harmonic oscillations and take $z_k(t) = z_k e^{i\omega t}$ (where $i = \sqrt{-1}$). If we insert this expression into (10.6) for each atom $k$, then we again obtain the graph eigenvalue problem (10.3), now with $\nu = d - m\omega^2/K$, where $d$ is the vertex degree of the lattice. Thus, a solution of the discrete model is equivalent to an approximate solution of the continuous model. (Of course, in some cases the whole thing can be considered the other way around; the continuous model could give an approximate solution for the discrete model.)

We conclude this section with the following observation. If the 
problems of continuous mathematics (analysis) are to be solved 
by means of computers, they must be approximated by the cor- 
responding discrete models since computers operate with discrete 
actions. They change their state in discrete moments and an in- 
ner state of a computer is determined by the states of a finite 
number of computer cells where the number of states of any cell is 
finite. Therefore numerical mathematics represents a link (a union 
of sorts) of the continuous and discrete. 


10.3 Chemistry: Unsaturated 
Hydrocarbons 


In this section? we present a specific chemical application of matrix 
theory and graph theory. The applications of matrices and graphs 
in chemistry (especially in physical and theoretical chemistry) are 
so numerous that it is impossible to give any reasonable survey 
in a limited space. The interested reader may consult [50]. Our 
discussion is in four parts. 


Hückel Molecular Orbital Theory 


One of the basic goals of quantum chemistry is to describe the 
electronic structure of molecules. This can be done by solving the 
Schrodinger equation 

AW, = EV, (10.7) 
where H is the Hamiltonian operator (or energy operator), Y; is 
the wave function of the system under consideration, and Æ; is 
the energy of the system. The subscript 7 indicates that in the 
general case a Schrodinger equation has more than one solution. 


5This section is based on a chapter of [23] that was written by I. Gutman. 
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The wave function W, fully describes the jth state of the system 
whose Hamiltonian operator is H. 

If the wave function describes the state of an electron in
a molecule, then it is called a molecular orbital. The physical
meaning of a molecular orbital $\Psi = \Psi(x, y, z)$ is that
$|\Psi(x, y, z)|^2\,dx\,dy\,dz$ is the probability of finding the
pertinent electron in the volume element $dV = dx\,dy\,dz$ at the
point with the space coordinates $x, y, z$.

The Hamiltonian operator requires (among other operations)
the calculation of the second partial derivatives with respect to
the space coordinates x, y, and z. Thus, the Schrödinger equation
is a second order partial differential equation. Under certain
conditions (which we will not specify here), the differential equation
(10.7) can be transformed into matrix form:

$$H\Psi_j = E_j\Psi_j, \tag{10.8}$$

where now $H$ is a Hamiltonian matrix and $\Psi_j$ is the wave function
in vector form. From (10.8) it is evident that $\Psi_j$ is an eigenvector
and $E_j$ is the corresponding eigenvalue of the matrix $H$.

In order to solve the Schrödinger equations (10.7) and (10.8)
for complicated many-electron molecular systems, various
approximations are used. In the pioneering days of quantum chemistry
(in the 1930s and 1940s) an approximate method for describing
the state of single electrons in conjugated hydrocarbons was
developed, known under the name Hückel molecular orbital theory.⁶

⁶ Hydrocarbons are chemical compounds composed of only two elements,
carbon (C) and hydrogen (H). A hydrocarbon is saturated if its molecules
possess only single bonds. If in a molecule there are also multiple bonds, then
the hydrocarbon is unsaturated. An important class of unsaturated
hydrocarbons is the conjugated hydrocarbons, each of whose carbon atoms
participates in exactly one double bond. We assume that in a hydrocarbon
molecule all carbon atoms have valency 4 and all hydrogen atoms have valency 1.

The Hückel graph [42] is used for an abbreviated representation of
conjugated hydrocarbons. Its vertices represent only the carbon atoms, and all
its edges are simple (irrespective of whether the corresponding chemical bonds
are single or double). The vertices of a Hückel graph may be of degree 1, 2, or 3.

Within the framework of the Hückel method, the Hamiltonian
matrix $H = [h_{rs}]$ is a square matrix of order $n$, where $n$ is the
number of carbon atoms in the molecule. Let these carbon atoms
be labeled by $1, 2, \ldots, n$. Then the matrix elements $h_{rs}$ are given
by

$$h_{rs} = \begin{cases}
\alpha & \text{if } r = s, \\
\beta & \text{if } r \neq s \text{ and the atoms } r \text{ and } s \text{ are chemically bonded,} \\
0 & \text{if } r \neq s \text{ and no chemical bond between the atoms } r \text{ and } s \text{ exists.}
\end{cases} \tag{10.9}$$

The parameters $\alpha$ and $\beta$ are called the Coulomb and the resonance
integral; in Hückel theory these are assumed to be constants. The
approximations imposed by the relations (10.9) are severe. Therefore
it is surprising that the results of the Hückel theory are (at
least sometimes) in good agreement with both experimental findings
and other, more advanced, theoretical approaches [67].

For example, for the hydrocarbon styrene (I, Figure 10.8) the
Hückel-Hamiltonian matrix has the form

[8 × 8 matrix with α in every diagonal entry, β in each entry (r, s) for
which the carbon atoms r and s are bonded, and 0 elsewhere; the atom
labeling is that of Figure 10.8]

Keeping in mind relations (10.9), we see that the Hückel- 
Hamiltonian matrix can be presented as 


$$H = \alpha I_n + \beta A, \tag{10.10}$$


where A is a symmetric matrix whose diagonal elements equal 
0 and whose off-diagonal elements equal 1 or 0, depending on 
whether the corresponding atoms are connected or not. In fact, 
A= Ay is just the adjacency matrix of the Hückel graph. Graph 
II in Figure 10.8 is the Hückel graph of styrene I. Equation (10.10) 
immediately gives the following result. 
[Figure 10.8: styrene (I) and its Hückel graph (II)]


Theorem 10.3.1 If $\lambda$ is an eigenvalue and $z$ is an eigenvector of
the matrix $A$, then $\alpha + \beta\lambda$ is an eigenvalue and $z$ is an
eigenvector of the matrix $H$.


From this theorem it follows that the Hückel molecular orbitals
$\Psi_j$ coincide with the eigenvectors $z_j$ of the adjacency matrix of the
Hückel graph, that is, $\Psi_j = z_j$. The eigenvalues $\lambda_j$ of the matrix
$A$ and the energies $E_j$ of the corresponding electrons are related
simply as

$$E_j = \alpha + \beta\lambda_j.$$

There are exactly $n$ different molecular orbitals, namely, the $z_j$ for
$j = 1, 2, \ldots, n$.
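
The relation between the spectra of A and H can be checked numerically on a small case. The following is a minimal sketch, not part of the original text: it takes the path on four vertices as the Hückel graph, uses illustrative values for α and β, and relies on the closed-form eigenpairs of a path (the eigenvalues are derived later in this section; the sine-shaped eigenvectors are a standard fact assumed here). All function names are hypothetical.

```python
import math

def path_adjacency(n):
    """Adjacency matrix of the path on n vertices, as a list of rows."""
    return [[1 if abs(r - s) == 1 else 0 for s in range(n)] for r in range(n)]

def huckel_matrix(adj, alpha, beta):
    """Build H = alpha*I + beta*A, as in (10.10)."""
    n = len(adj)
    return [[(alpha if r == s else 0.0) + beta * adj[r][s] for s in range(n)]
            for r in range(n)]

def matvec(matrix, vector):
    """Plain matrix-vector product."""
    return [sum(row[s] * vector[s] for s in range(len(vector))) for row in matrix]

# Illustrative parameter values only; they are not taken from the text.
n, alpha, beta = 4, -6.0, -2.5
H = huckel_matrix(path_adjacency(n), alpha, beta)

residuals = []
for j in range(1, n + 1):
    lam = 2 * math.cos(math.pi * j / (n + 1))                 # eigenvalue of A
    z = [math.sin(math.pi * j * k / (n + 1)) for k in range(1, n + 1)]  # eigenvector of A
    energy = alpha + beta * lam                                # predicted eigenvalue of H
    Hz = matvec(H, z)
    residuals.append(max(abs(Hz[k] - energy * z[k]) for k in range(n)))

print("largest residual of H z = (alpha + beta*lambda) z:", max(residuals))
```

The residuals vanish up to rounding, so each eigenvector of A is also an eigenvector of H, with the eigenvalue shifted and scaled as the theorem states.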

This important conclusion shows that there is a deep and far- 
reaching relation between the Hückel molecular orbital theory and 
graph spectral theory. The Hückel theory provides an important
field of application of the graph spectra. 

For more information on Hückel theory the interested reader 
can consult, for example, [2], [21], [27], [40], [36], [77], and [24]. 


Two Examples: Linear Polyenes and Annulenes 


We are now going to determine the characteristic polynomials 
and spectra of two important Hückel graphs, namely, those asso- 
ciated with linear polyenes and annulenes, as given in Figure 10.9 
for n = 8. The Hückel graphs of these compounds are given in 
Figure 10.10 and are, respectively, the path $P_n$ with $n$ vertices and
the circuit $C_n$ of length $n$.


[Figure 10.9: a linear polyene and an annulene, each with n = 8 carbon atoms]


[Figure 10.10: the path P_n and the circuit C_n on the vertices v_1, v_2, ..., v_n]


We prove first some general results [41], [42]. In Section 8.5,
the characteristic polynomial $p(W, \lambda)$ of a forest $W$ with $n$ vertices
was shown to satisfy the formula

$$p(W, \lambda) = \sum_{k=0}^{\lfloor n/2 \rfloor} (-1)^k\, m(W, k)\, \lambda^{n-2k}, \tag{10.11}$$

where $m(W, k)$ is the number of $k$-matchings in $W$ (with $m(W, 0) = 1$).

Let $G$ be a graph with $n$ vertices $v_1, v_2, \ldots, v_n$ and let $e_{rs}$ be an
arbitrary edge of $G$ connecting the vertices $v_r$ and $v_s$. The graphs
$G - e_{rs}$ and $G - v_r - v_s$ are obtained from $G$ by deleting, respectively,
the edge $e_{rs}$ and the vertices $v_r$ and $v_s$ (and all their incident edges).


Lemma 10.3.2 For an arbitrary graph $G$, we have

$$m(G, k) = m(G - e_{rs}, k) + m(G - v_r - v_s, k - 1). \tag{10.12}$$


Proof. The $k$-matchings in $G$ are of two types: the edge $e_{rs}$
(i) is in the $k$-matching or (ii) is not in the $k$-matching. The number
of $k$-matchings of type (i) is the number of $(k - 1)$-matchings of
$G - v_r - v_s$, and thus equals $m(G - v_r - v_s, k - 1)$. The number of
$k$-matchings of type (ii) is the number of $k$-matchings of $G - e_{rs}$,
and thus equals $m(G - e_{rs}, k)$. The lemma now follows.
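
The counting argument in this proof can be confirmed by brute force on a small graph. The sketch below is only an illustration under stated assumptions: the test graph (a 6-cycle with one chord) and all function names are hypothetical, and matchings are counted by naive enumeration of edge subsets, which is practical only for small graphs.

```python
from itertools import combinations

def count_matchings(edges, k):
    """Number of k-matchings: sets of k edges, no two sharing a vertex."""
    total = 0
    for subset in combinations(edges, k):
        endpoints = [v for edge in subset for v in edge]
        if len(endpoints) == len(set(endpoints)):
            total += 1
    return total  # for k = 0 the empty set is counted once

def delete_edge(edges, e):
    """Edge list of G - e."""
    return [f for f in edges if set(f) != set(e)]

def delete_vertices(edges, removed):
    """Edge list of G with the given vertices and their incident edges removed."""
    return [f for f in edges if not set(f) & set(removed)]

# Hypothetical test graph: a 6-cycle with the chord (1, 4).
edges = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 1), (1, 4)]
e_rs = (1, 4)

checks = []
for k in range(1, 4):
    lhs = count_matchings(edges, k)
    rhs = (count_matchings(delete_edge(edges, e_rs), k)
           + count_matchings(delete_vertices(edges, e_rs), k - 1))
    checks.append((k, lhs, rhs))

for k, lhs, rhs in checks:
    print(f"k={k}: m(G,k)={lhs}, right-hand side of (10.12)={rhs}")
```

For every k the two sides of (10.12) agree, whichever edge is chosen as the distinguished edge.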


Combining equations (10.11) and (10.12), we obtain the fol- 
lowing result. 


Theorem 10.3.3 The characteristic polynomial of a forest $W$
satisfies the recurrence relation

$$p(W, \lambda) = p(W - e_{rs}, \lambda) - p(W - v_r - v_s, \lambda),$$

where $e_{rs}$ is an arbitrary edge of $W$ that connects the vertices $v_r$
and $v_s$.


We now apply Theorem 10.3.3 to the edge connecting the vertices
$v_n$ and $v_{n-1}$ of the path $P_n$ in Figure 10.10. $P_n - e_{n,n-1}$ is the
graph with connected components $P_{n-1}$ and $P_1$. Therefore

$$p(P_n - e_{n,n-1}, \lambda) = p(P_1, \lambda)\, p(P_{n-1}, \lambda) = \lambda\, p(P_{n-1}, \lambda).$$

In addition, $P_n - v_n - v_{n-1}$ is the path with $n - 2$ vertices. Thus
from Theorem 10.3.3 we obtain

$$p(P_n, \lambda) = \lambda\, p(P_{n-1}, \lambda) - p(P_{n-2}, \lambda), \tag{10.13}$$

from which it is easy to recursively evaluate the polynomials
$p(P_n, \lambda)$ starting with $p(P_1, \lambda) = \lambda$ and $p(P_2, \lambda) = \lambda^2 - 1$.
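
The recursive evaluation just described is easy to mechanize. The following is a minimal sketch, assuming polynomials are stored as coefficient lists with the constant term first, and adopting the convention p(P_0, λ) = 1 (an assumption made here so that (10.13) reproduces p(P_2, λ) = λ² − 1); the function name is hypothetical.

```python
def path_char_poly(n):
    """Coefficients of p(P_n, lambda), constant term first, via recurrence (10.13)."""
    previous, current = [1], [0, 1]      # p(P_0) = 1 (convention), p(P_1) = lambda
    if n == 0:
        return previous
    for _ in range(2, n + 1):
        shifted = [0] + current          # lambda * p(P_{m-1})
        padded = previous + [0] * (len(shifted) - len(previous))
        previous, current = current, [a - b for a, b in zip(shifted, padded)]
    return current

for n in range(1, 6):
    print(n, path_char_poly(n))
```

For instance, `path_char_poly(4)` returns `[1, 0, -3, 0, 1]`, that is, λ⁴ − 3λ² + 1. This agrees with (10.11), since the path on four vertices has three 1-matchings and one 2-matching.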
In the theory of special functions, the Chebyshev functions
$T_n(\lambda)$ of the first kind and the Chebyshev functions $U_n(\lambda)$ of the
second kind are investigated. These are the two independent particular
solutions of the differential equation


where t = cos λ. The Chebyshev functions satisfy the recurrence
relations

$$T_n(\lambda) = 2\lambda T_{n-1}(\lambda) - T_{n-2}(\lambda),$$
$$U_n(\lambda) = 2\lambda U_{n-1}(\lambda) - U_{n-2}(\lambda),$$

whose forms are quite similar to that of equation (10.13).
Knowing that $U_2(\lambda) = 2\lambda\sqrt{1 - \lambda^2}$ and
$U_3(\lambda) = (4\lambda^2 - 1)\sqrt{1 - \lambda^2}$, it is easy to verify that

$$p(P_n, 2\lambda)\sqrt{1 - \lambda^2} = U_{n+1}(\lambda). \tag{10.14}$$


This identity (10.14) is an example of the interesting connections 
that exist between graphs and special functions. 

The general solution of the recurrence relation $f_n = a f_{n-1} + b f_{n-2}$
is $f_n = A x_1^n + B x_2^n$, where $A$ and $B$ are constants and $x_1$
and $x_2$ are the roots of the equation $x^2 = ax + b$. We apply
this fact to (10.13). The roots of the equation $x^2 = \lambda x - 1$ are

$$x_{1,2} = \frac{\lambda \pm \sqrt{\lambda^2 - 4}}{2}.$$

After substituting $\lambda = 2\cos t$, we get

$$x_{1,2} = \cos t \pm \sqrt{\cos^2 t - 1} = \cos t \pm i \sin t.$$

Taking into account the Euler formula $\cos t \pm i \sin t = e^{\pm it}$, we
further obtain

$$p(P_n, 2\cos t) = A\,(e^{it})^n + B\,(e^{-it})^n
= A e^{int} + B e^{-int}
= (A + B)\cos nt + i(A - B)\sin nt.$$

The constants $A$ and $B$ are determined from the initial conditions:

$$p(P_1, 2\cos t) = 2\cos t \quad \text{and} \quad p(P_2, 2\cos t) = 4\cos^2 t - 1.$$

Elementary calculation gives $A + B = 1$ and $i(A - B) = \dfrac{\cos t}{\sin t}$, so
that

$$p(P_n, 2\cos t) = \cos nt + \frac{\cos t}{\sin t}\sin nt
= \frac{\cos nt \sin t + \sin nt \cos t}{\sin t}
= \frac{\sin(n+1)t}{\sin t}.$$


Consequently, the characteristic polynomial of the path with $n$
vertices has the following simple form:

$$p(P_n, \lambda) = \frac{\sin(n+1)t}{\sin t}, \tag{10.15}$$

with $\lambda = 2\cos t$. The spectrum of the path now follows immediately
from the equation $p(P_n, \lambda) = 0$. The condition $\sin(n+1)t = 0$
implies $(n + 1)t = \pi j$, that is,

$$\lambda_j = 2\cos\frac{\pi j}{n+1}, \qquad j = 1, 2, \ldots, n.$$
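
As a sanity check, the closed form (10.15) and the resulting spectrum can be compared numerically with the recurrence (10.13). This is a sketch under assumptions: the value n = 8, the sample angle t = 0.7, and the function name are chosen here for illustration, and p(P_0, λ) = 1 is used as a starting convention.

```python
import math

def path_poly_value(n, lam):
    """Evaluate p(P_n, lam) numerically with recurrence (10.13)."""
    previous, current = 1.0, lam         # p(P_0) = 1 (convention), p(P_1) = lam
    if n == 0:
        return previous
    for _ in range(2, n + 1):
        previous, current = current, lam * current - previous
    return current

n, t = 8, 0.7                            # illustrative choices

# Closed form (10.15) versus the recurrence at lambda = 2 cos t.
closed_form = math.sin((n + 1) * t) / math.sin(t)
gap = abs(path_poly_value(n, 2 * math.cos(t)) - closed_form)

# The claimed eigenvalues should be roots of p(P_n, lambda).
spectrum = [2 * math.cos(math.pi * j / (n + 1)) for j in range(1, n + 1)]
worst_root = max(abs(path_poly_value(n, lam)) for lam in spectrum)

print("difference between (10.13) and (10.15):", gap)
print("largest |p(P_n, lambda_j)|:", worst_root)
```

Both printed quantities are zero up to rounding error.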

We now calculate the characteristic polynomial and spectrum
of the circuit $C_n$. The graph $C_n$ is a connected graph with $n$
vertices and $n$ edges. Theorem 8.4.5 from Section 8.5 implies that
this basic figure affects only the coefficient $a_n$ and its contribution
to $a_n$ is equal to $(-1)^1 2^1 = -2$. All other basic figures are
composed exclusively of graphs $K_2$. Therefore, the coefficients of the
characteristic polynomial of the cycle $C_n$ are determined as $a_j = b_j$
for $j = 1, 2, \ldots, n - 1$ and $a_n = b_n - 2$, where

$$b_{2k} = (-1)^k m(C_n, k), \quad k = 1, 2, \ldots, \qquad \text{and} \qquad
b_{2k+1} = 0, \quad k = 0, 1, \ldots.$$

This result can be formulated as

$$p(C_n, \lambda) = -2 + \sum_{k=0}^{\lfloor n/2 \rfloor} (-1)^k\, m(C_n, k)\, \lambda^{n-2k}. \tag{10.16}$$
We now apply Lemma 10.3.2 to the edge $e_{1n}$ connecting the
vertices $v_1$ and $v_n$ of the cycle $C_n$. It is easily seen that
$C_n - e_{1n} = P_n$ whereas $C_n - v_1 - v_n = P_{n-2}$. Hence
$m(C_n, k) = m(P_n, k) + m(P_{n-2}, k - 1)$, which when substituted back
into (10.16), gives

$$p(C_n, \lambda) = p(P_n, \lambda) - p(P_{n-2}, \lambda) - 2. \tag{10.17}$$
Using (10.15), we have further

$$p(C_n, \lambda) = \frac{\sin(n+1)t}{\sin t} - \frac{\sin(n-1)t}{\sin t} - 2
= 2(\cos nt - 1), \tag{10.18}$$

where $\lambda = 2\cos t$. From (10.18) we can directly determine the
spectrum of $C_n$. If $p(C_n, \lambda) = 0$, then $nt = 2\pi j$, and thus

$$\lambda_j = 2\cos\frac{2\pi j}{n}, \qquad j = 1, 2, \ldots, n.$$
n 

The Chebyshev functions of the first kind and the characteristic
polynomial of the cycle are related by

$$p(C_n, 2\lambda) = 2\,(T_n(\lambda) - 1).$$


Finally we note that the knowledge of the spectrum of the graphs
$P_n$ and $C_n$ is of great importance in the quantum chemical
descriptions of the electronic structure of linear polyenes and annulenes.
In particular, it is important that the spectrum of the cycle $C_n$
possesses (two) zeros if and only if $n = 4l$ ($l = 1, 2, \ldots$). It will
be explained in the next part of this section that this means that
the annulenes with $4l$ carbon atoms have nonbonding molecular
orbitals and that these compounds are chemically unstable.
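
The statement about zeros in the spectrum of the cycle can be checked directly from (10.17) and the spectrum formula. The following is a sketch with hypothetical function names; the tolerance used to decide whether a floating-point eigenvalue counts as zero is an assumption of this example.

```python
import math

def path_poly_value(n, lam):
    """Evaluate p(P_n, lam) with recurrence (10.13); p(P_0) = 1 by convention."""
    previous, current = 1.0, lam
    if n == 0:
        return previous
    for _ in range(2, n + 1):
        previous, current = current, lam * current - previous
    return current

def cycle_poly_value(n, lam):
    """Evaluate p(C_n, lam) through relation (10.17)."""
    return path_poly_value(n, lam) - path_poly_value(n - 2, lam) - 2

def cycle_spectrum(n):
    """Eigenvalues 2 cos(2 pi j / n), j = 1, ..., n, of the circuit C_n."""
    return [2 * math.cos(2 * math.pi * j / n) for j in range(1, n + 1)]

def zero_multiplicity(n, tol=1e-9):
    """Number of eigenvalues of C_n that vanish (up to the tolerance)."""
    return sum(1 for lam in cycle_spectrum(n) if abs(lam) < tol)

worst_root = max(abs(cycle_poly_value(8, lam)) for lam in cycle_spectrum(8))
multiplicities = [zero_multiplicity(n) for n in range(3, 13)]

print("largest |p(C_8, lambda_j)|:", worst_root)
print("multiplicity of 0 for n = 3..12:", multiplicities)
```

The multiplicity is 2 for n = 4, 8, 12 and 0 for the other values of n in this range, in line with the condition n = 4l.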


Stability of a Molecule 


We shall discuss here some problems and results of Hückel the- 
ory that can be formulated in a particularly simple way using 
graph-theoretical terminology. 

A molecular orbital $\Psi_j$ with the energy $E_j = \alpha + \beta\lambda_j$ is called
bonding if $\lambda_j > 0$, antibonding if $\lambda_j < 0$, and nonbonding if $\lambda_j = 0$.
The electrons in the bonding molecular orbitals strengthen the
chemical bonds in the molecule, whereas the effect of the electrons
in antibonding orbitals is just the opposite. The electrons
whose state is described by nonbonding molecular orbitals play a
less pronounced role in the creation of chemical bonds. However,
within the framework of the Hückel theory, it can be demonstrated
that conjugated hydrocarbons possessing nonbonding molecular
orbitals are extremely unstable and chemically reactive. The origin
of this phenomenon cannot be explained here.

In the theory of conjugated compounds it is quite important
to establish which systems have nonbonding molecular orbitals.
Evidently, the number of nonbonding molecular orbitals coincides
with the multiplicity of 0 in the spectrum of the pertinent Hückel
graph. Because $\det A = \prod_{j=1}^{n} \lambda_j$, 0 is in the spectrum of a
graph if and only if $\det A = 0$.

A general solution of the problem of finding the multiplicity 
of 0 in the spectrum of a graph is not known, but a variety of 
partial results have been obtained. As an illustration we present 
the following two statements (see also [24], Section 8.1). 


Theorem 10.3.4 Assume the graph $G$ has a vertex $v_r$ of degree
1, where $v_r$ is adjacent to the vertex $v_s$. Then the graphs $G$ and
$G - v_r - v_s$ have equal multiplicity of the number 0 in their spectra.


Proof. Let $c = [c_1, c_2, \ldots, c_n]$ be an eigenvector of the adjacency
matrix $A = [a_{pq}]$ of some graph on $n$ vertices corresponding to the
eigenvalue 0. Then $Ac = 0$ and so

$$\sum_{q=1}^{n} a_{pq} c_q = 0 \qquad (p = 1, 2, \ldots, n).$$

Because $a_{pq} = 0$ when the vertices $v_p$ and $v_q$ are not adjacent and
$a_{pq} = 1$ if these vertices are adjacent, we get

$$\sum_{q \to p} c_q = 0 \qquad (p = 1, 2, \ldots, n), \tag{10.19}$$

where the summation is over all $q$ such that the vertex $v_q$ is
adjacent to $v_p$. Hence if $c$ is an eigenvector of a graph for
eigenvalue 0, the sum of the components of the vector $c$ “around” each
vertex $v_p$ equals zero. Hence the number of linearly independent
eigenvectors with eigenvalue 0, equivalently, the multiplicity of 0
in the spectrum of a graph, is equal to the number of independent
components $c_p$ in the system of equations (10.19).

In the graph $G$, let the vertex $v_s$ be adjacent to the vertices
$v_a, v_b, \ldots, v_f$ in addition to vertex $v_r$. Let the system (10.19) be
satisfied for the graph $G - v_r - v_s$ so that the vertices $v_a, v_b, \ldots, v_f$
correspond to the components $c_a, c_b, \ldots, c_f$ of the vector $c$.

Now consider the graph $G$ and the system (10.19). Then we
have to add to the system (10.19) for $G - v_r - v_s$ two further
equations: $c_s = 0$ and $c_a + c_b + \cdots + c_f + c_r = 0$. Because of
the condition $c_s = 0$, all equations (10.19) that were valid for the
graph $G - v_r - v_s$ also hold for $G$. Because $c_s = 0$ and
$c_r = -(c_a + c_b + \cdots + c_f)$ are evidently not new independent variables,
we see that the number of independent components of the vector
$c$ in the graph $G - v_r - v_s$ is the same as that for $G$.
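
The theorem can be tried out on a concrete graph by computing the multiplicity of 0 as the nullity of the adjacency matrix. The following is a minimal sketch, assuming exact Gaussian elimination over the rationals; the six-vertex test graph, the vertex numbering from 0, and the function names are hypothetical choices made for this illustration.

```python
from fractions import Fraction

def nullity(n, edges):
    """Multiplicity of 0 in the spectrum: n minus the rank of the adjacency matrix."""
    m = [[Fraction(0)] * n for _ in range(n)]
    for r, s in edges:
        m[r][s] = m[s][r] = Fraction(1)
    rank = 0
    for col in range(n):
        pivot = next((row for row in range(rank, n) if m[row][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for row in range(n):
            if row != rank and m[row][col] != 0:
                factor = m[row][col] / m[rank][col]
                m[row] = [a - factor * b for a, b in zip(m[row], m[rank])]
        rank += 1
    return n - rank

def remove_vertices(n, edges, removed):
    """Delete vertices (and incident edges), relabeling the rest as 0, 1, 2, ..."""
    keep = [v for v in range(n) if v not in removed]
    relabel = {v: i for i, v in enumerate(keep)}
    new_edges = [(relabel[r], relabel[s]) for r, s in edges
                 if r in relabel and s in relabel]
    return len(keep), new_edges

# Hypothetical test graph: vertex 0 has degree 1 and is adjacent to vertex 1.
n = 6
edges = [(0, 1), (1, 2), (1, 3), (2, 4), (3, 4), (4, 5)]

nullity_G = nullity(n, edges)
nullity_reduced = nullity(*remove_vertices(n, edges, {0, 1}))
print(nullity_G, nullity_reduced)
```

Here the reduced graph is a star on four vertices, and both graphs have 0 as an eigenvalue of multiplicity 2. Because the adjacency matrix is symmetric, its nullity equals the multiplicity of the eigenvalue 0, which is what justifies the rank computation.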


Theorem 10.3.5 Assume that the graph $G$ has a path

$$v_r, v_a, v_b, v_c, v_d, v_s$$

where the vertices $v_a, v_b, v_c, v_d$ have degree equal to 2. Let the graph
$G'$ be obtained from $G$ by deleting the vertices $v_a, v_b, v_c$, and $v_d$ and
introducing a new edge $e_{rs}$ between the vertices $v_r$ and $v_s$. Then $G$
and $G'$ have equal multiplicity of the number 0 in their spectra.


Proof. In order that the system (10.19) be satisfied for the graph
$G$, the following equations (among others) must hold:

$$c_r + c_b = 0, \quad c_a + c_c = 0, \quad c_b + c_d = 0, \quad c_c + c_s = 0.$$

Consequently, $c_a = c_s$ and $c_d = c_r$. Therefore, the deletion of
$v_a, v_b, v_c$, and $v_d$ and the simultaneous connection of $v_r$ and $v_s$
cause no change in the number of independent components of the
vector $c$.


Alternant Hydrocarbons and Their Graphs 


In theoretical chemistry, a frequently used concept is that of
alternant hydrocarbons. A conjugated hydrocarbon is said to be
alternant if all its atoms can be simultaneously labeled by two
labels (usually called star and circle), so that every atom labeled
by a star has only neighbors labeled by a circle and vice versa.
Hydrocarbons for which such a labeling is not possible are called
nonalternant.

The labeling of the atoms of the molecule by stars and circles 
is equivalent to the coloring of the vertices of the molecular graph 
by two colors (as described in Section 1.1). Therefore alternant 
(respectively, nonalternant) hydrocarbons have bipartite (respec- 
tively, nonbipartite) molecular graphs. 

In Section 8.3 it was proved that the spectrum of a bipartite
graph is symmetric with respect to zero, that is, $\lambda$ and $-\lambda$ are
eigenvalues with equal multiplicity. In Hückel theory this result is
interpreted to mean that the energy levels of the molecular orbitals
are symmetrically distributed around the energy $E = \alpha$.
Consequently, the orbital with energy $E = \alpha + \beta\lambda$ is “paired” with an
orbital with energy $E = \alpha - \beta\lambda$. This is the famous “Pairing
theorem” that has a number of important consequences in quantum
chemistry.
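
Deciding whether a hydrocarbon is alternant amounts to attempting a two-coloring of its Hückel graph. The sketch below is one possible way to do this, using breadth-first search; the function names and the two test rings (a 6-cycle and a 5-cycle, with vertices numbered from 0) are assumptions of this example. It also checks the pairing of eigenvalues for the 6-cycle, using the cycle spectrum obtained earlier in this section.

```python
import math
from collections import deque

def star_circle_labeling(n, edges):
    """Return a star/circle label per vertex, or None if no such labeling exists."""
    neighbors = [[] for _ in range(n)]
    for r, s in edges:
        neighbors[r].append(s)
        neighbors[s].append(r)
    label = [None] * n
    for start in range(n):               # handles disconnected graphs too
        if label[start] is not None:
            continue
        label[start] = "star"
        queue = deque([start])
        while queue:
            v = queue.popleft()
            for w in neighbors[v]:
                if label[w] is None:
                    label[w] = "circle" if label[v] == "star" else "star"
                    queue.append(w)
                elif label[w] == label[v]:
                    return None          # two adjacent atoms would share a label
    return label

def cycle_edges(n):
    """Edges of the circuit on vertices 0, ..., n-1."""
    return [(i, (i + 1) % n) for i in range(n)]

six_ring = star_circle_labeling(6, cycle_edges(6))    # alternant
five_ring = star_circle_labeling(5, cycle_edges(5))   # nonalternant

# Pairing check for the 6-cycle: the spectrum equals its own negative.
spectrum = sorted(round(2 * math.cos(2 * math.pi * j / 6), 9) for j in range(1, 7))
paired = spectrum == sorted(-x for x in spectrum)

print(six_ring)
print(five_ring)
print(paired)
```

The 6-cycle admits the alternating labeling, the 5-cycle does not, and the eigenvalues of the 6-cycle occur in pairs λ, −λ as the Pairing theorem requires.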


10.4 Exercises 


1. Assume that the graph $G$ has two vertices $v_q$ and $v_p$ of degree
1 that are adjacent to the same vertex $v_r$. Prove that 0 is an
eigenvalue of $G$.


2. Let M be the size of the maximal matching in a tree T on
n vertices. Show that the multiplicity of the eigenvalue 0 in
the spectrum of T is equal to n − 2M.


3. Prove or disprove: The adjacency matrix of a tree T is reg- 
ular if and only if for each vertex v, the forest T — v has 
exactly one component with an odd number of vertices. 


4. Find values of n for which all eigenvalues of the circuit $C_n$
are integers.⁷


⁷ Graphs whose spectra consist entirely of integers are called integral graphs.


Coda 


As remarked in the preface, the graph-theoretical connections with 
matrix theory are numerous, and emphasizing them often leads to 
a clearer and deeper understanding of many of the concepts and 
results of matrix theory. The first systematic use of graphs with 
matrices seems to be by König [55]. We have introduced sev- 
eral (weighted) graphs that can be associated to a matrix—the 
König digraph and the (Coates) digraph of a matrix being the 
most prominent of them. The digraph G(A) of the matrix A is 
called the König digraph, because König used the corresponding
bipartite graph in his papers (see [56]). The digraph D*(A) is 
named the Coates digraph and formula (4.1) is called the Coates 
formula, because C.L. Coates introduced them in [13], although 
it is very hard to establish who first came to the idea of such a 
graphical interpretation of a determinant (see the discussion about 
this in Chapter 1 of [24]). F. Harary, referring to C.L. Coates, has 
proposed in [45] that this formula, in a somewhat changed form, 
could be taken as the definition of the determinant. Therefore, 
this definition could be called the Harary–Coates definition.
However, it seems that Harary’s suggestion from [45] was forgotten,
and one of the authors of this book later independently came to 
the same idea and, starting from it, outlined the elementary the- 
ory of determinants [22]. A similar development appeared a little 
bit later and again independently [37]. These two digraphs have 
been used to illuminate the basic algebraic properties of matrices, 
including matrix multiplication, determinants, inverses of matri- 
ces, cofactors, and Cramer’s formula, and including solutions of 
linear systems of equations. 

Consideration of the digraph often leads to an easier description
of many matrix properties. It also suggests different, and
sometimes more elementary, proofs of important theorems and the 
possibility of generalization. We have given a proof of the classical 
Cayley-Hamilton theorem which illustrates that it is really a the- 
orem about weighted digraphs; this was first noted by Rutherford 
[70] (see also [75], [81], and [7]). We have also seen how a large part 
of the proof of the Jordan canonical form—starting from Jacobi’s 
theorem that a matrix is similar to a triangular matrix—can be 
made graph-theoretical (see [6] and the reference to Turnbull and 
Aitken there). 

The theory of positive, more generally nonnegative, matrices— 
the so-called Perron-Frobenius theory—depends substantially on 
the zero-nonzero pattern of a matrix, and this translates to the 
digraph. For instance, an irreducible matrix becomes a strongly 
connected digraph; for more on this, one may consult [3] and [7]. 
We have seen how properties of the eigenvalues of a nonnegative 
matrix heavily depend on the digraph of the matrix. By use of 
the digraph, we were able to give a substantial generalization of 
the Geršgorin inclusion region for the eigenvalues of a matrix (see
[7] and [78] for much more on this topic).

The permanent is an important and fundamental combinatorial 
function for which we have given a graphical interpretation. It can 
be used to motivate the important notion of an SNS-matrix. More 
about SNS-matrices and related topics can be found in [10]. 

We have included in the bibliography several books for further 
study and historical information as well as several classical papers 
that influenced the development of graph theory as a tool in ma- 
trix theory. The book [23] (in Serbian) also contains related and 
additional information. 

Finally, we remark that the following four groups of papers, 
from electrical engineering, mathematics, and chemistry, belong 
to the origins of our combinatorial approach to matrix theory. 

1. Electrical engineers have developed a series of methods for 
solving systems of linear algebraic equations, which appear in the 
theory of electrical circuits, control theory, and other areas. These 
methods use flow graphs (Coates [13], [26], [15]), signal flow graphs 
(Mason [59], [60], [80]), and Chan graphs (Chan and Mai [16], 
[14]); the first two graphs are described in Chapter 6. It is
remarkable that the mentioned graphs (and, especially, Mason’s
signal flow graphs) give a better insight into the physical system
under description than the corresponding system of equations does.
These graphs were therefore introduced and used intuitively, with
their theoretical background often supplied only later. The
terminology used in this book is partially based on the terminology 
used in the literature of electrical engineering. 


2. There are many mathematical papers in which results from
matrix theory are obtained or proved by graph-theoretical
means. The founder of modern graph theory, the Hungarian
mathematician D. König, was the first to use graphical methods in
matrix theory [55], [56], although even before König there were
some attempts in this direction (see [64], footnote on p. 260, where
the Cauchy rule for the determination of the sign of a term in the 
development of a determinant is mentioned). See also more recent 
papers that belong to this group ([30], [20], [32], [33]), represent- 
ing only a few examples. It is interesting to note that only papers 
with original and sufficiently nontrivial results obtained by graph 
theory were published, and it was only very recently that a few 
papers were published in which more elementary but more funda- 
mental questions were also interpreted in this way ([22], [37], [38], 
[58]). 

3. In the theory of graph spectra (see, for example, [18], [71], 
and [24]), including its applications to chemistry and to other 
branches of science, the results of matrix theory are used for
investigations of graphs. Although this is the inverse of the
procedure followed in this book, a great number of these
results helped to show how, in the other direction, graphs
can be used in matrix theory.


4. Some problems in electrical engineering, and in engineering 
in general, lead to the need of considering systems of linear equa- 
tions whose matrix is sparse and entries are given numerically. 
Special methods of treating such matrices use graph-theoretical 
means to a great extent [11], [4], [76]. 


Answers and Hints 


We give (partial) solutions or hints to a few selected exercises. 


Chapter 1 Exercises


. We have kn = 2e, where e is the number of edges.


. In forming an even combination of {1,2,...,n}, one has two
choices for each of 1,2,...,n − 1 (put it in the combination or
leave it out). When one gets to n, there is only 1 choice (put
n in if an odd number of integers has already been taken;
leave n out if an even number has been taken). This gives
$2^{n-1}$ even combinations.


13. Use the fact that 99 is −1 modulo 100.


14. A basis consists of the n − 1 vectors


Erd OSS Oe SO at Oe D 


Chapter 2 Exercises 


. Let $A = [a_{ij}]$ and $B = [b_{ij}]$ be upper triangular matrices of
order n, so that all entries below the main diagonal equal
0. If there is an edge with nonzero weight in their König
digraphs from black vertex i to white vertex j, then $i \le j$.
Now draw the composition G(A) * G(B) to see that in
G(A) · G(B) = G(AB) a similar property holds, implying
that AB is also upper triangular. A similar argument works
for lower triangular, or now use matrix transposition.


8. The König digraph of $I_n(i, j)$ has edges from black vertex
i to white vertex j and from black vertex j to white vertex
i. For $p \neq i, j$ there is also an edge from black vertex
p to white vertex p. The identities now follow by examining
the König digraphs
$G(I_n(i,j)^2) = G(I_n(i,j)) \cdot G(I_n(i,j))$
and
$G(I_n(i,k)\, I_n(k,j)\, I_n(j,i)) = G(I_n(i,k)) \cdot G(I_n(k,j)) \cdot G(I_n(j,i))$.


10. The product equals 
Og | 2% 
-h | Og |’ 
Chapter 3 Exercises 


2. The solution is easily obtained by using the digraphs D(A)
and D(B) as drawn in Figure I.

[Figure I: the digraphs D(A) and D(B)]


8. Use the fact that for the permutation matrix P in the
definition of a circulant of order n, $P^T = P^{-1} = P^{n-1}$.


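1. If n = 2, the matrix A has the form

$$A = \begin{bmatrix}
0 & a_{12} & 0 & a_{14} & 0 \\
a_{21} & 0 & a_{23} & 0 & a_{25} \\
0 & a_{32} & 0 & a_{34} & 0 \\
a_{41} & 0 & a_{43} & 0 & a_{45} \\
0 & a_{52} & 0 & a_{54} & 0
\end{bmatrix}.$$
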

The only edges (of nonzero weight) in the digraph D*(A) 
join odd numbered vertices to even numbered vertices (a 
directed bipartite graph). Hence no linear subdigraph exists, 
implying that det A = 0. 


Alternatively, one can consider the König digraph G(A) and 
observe that the only edges go from even numbered black 
vertices to odd numbered white vertices, and from odd num- 
bered black vertices to even numbered white vertices. Since 
there are n + 1 odd numbered black vertices and n even 
numbered white vertices, there is no 1-factor with nonzero 
weight; hence det A = 0. 


3. Let $G_n = G_n(a_1, a_2, \ldots, a_n)$ be the König digraph of the
given matrix $A_n(a_1, a_2, \ldots, a_n)$. Let $\mathcal{F}_n = \mathcal{F}_n(a_1, a_2, \ldots, a_n)$
be the collection of 1-factors of $G_n(a_1, a_2, \ldots, a_n)$, and let
$\mathcal{F}_{n,k}$ be the collection of 1-factors with exactly $k$ loops for
$k = 0, 1, \ldots, n$. Then $\det A_n(a_1, a_2, \ldots, a_n)$ equals

$$\sum_{k=0}^{n} \sum_{F \in \mathcal{F}_{n,k}} (\operatorname{sign} F)\, w(F)
= \sum_{k=0}^{n} \; \sum_{\{i_1, i_2, \ldots, i_k\} \subseteq \{1, 2, \ldots, n\}}
a_{i_1} a_{i_2} \cdots a_{i_k} \sum_{F \in \mathcal{F}'_{n-k}} (\operatorname{sign} F)\, w(F),$$

where $\mathcal{F}'_{n-k}$ represents the collection of 1-factors of
$G_{n-k}(0, 0, \ldots, 0)$. This implies that $\det A_n(a_1, a_2, \ldots, a_n)$
equals

$$\sum_{k=0}^{n} \det A_{n-k}(0, 0, \ldots, 0)\, e_k(a_1, a_2, \ldots, a_n),$$

where $e_k(a_1, a_2, \ldots, a_n)$ is the $k$th elementary symmetric
function of $a_1, a_2, \ldots, a_n$. It is easy to show
that $\det A_n(0, 0, \ldots, 0) = (-1)^{n-1}(n - 1)$. Therefore
$\det A_n(a_1, a_2, \ldots, a_n)$ equals

$$\sum_{k=0}^{n} (-1)^{n-k-1}(n - k - 1)\, e_k(a_1, a_2, \ldots, a_n).$$

k=0 
. Drawing the digraph D*(A) one sees that the only nonzero 


linear subdigraphs are those consisting of a cycle of length 
i + 1 containing the first i + 1 vertices and n — i loops (i = 
0,1,...,n). Such a linear subdigraph contributes 


(=J) ER) (Ta 22 ie 


to the Coates formula for the determinant. Hence 


n 


det A = (-1)" S°(-1)"ajx”* = SY a,x”. 
i=0 


i=0 


. The Coates digraph D*(A) cannot have any linear subdi- 


graphs of nonzero weight; equivalently, the König digraph 
does not have any 1-factors of nonzero weight. 


Observe that AT = —A. 


Chapter 5 Exercises 


. Drawing the digraph of a permutation matrix reveals its
inverse.


. Consider the digraphs D*(A), D*(B), D*(C) corresponding
to the matrices A, B, C (see Figure II).

[Figure II: the digraphs D*(A), D*(B), and D*(C)]


Chapter 6 Exercises 


8. The Coates digraph is drawn in Figure III. By inspection we 
get the following solution: 


—A(eug + duh) + B(avh — uhb) 


Az acvh+bfeu-bcuh-afve ’ 
a A(cuh + fve) 

> ~~ acuh + bfeu — beuh — afve’ 
ee A(feu — cuh) 

3 acuh + bfeu — beuh — afve’ 
ee A(fvd + cug) + B(ubf — afv) 


acvh + bfeu — bcuh — a fve ` 
[Figure III: the Coates digraph of the system]


Chapter 7 Exercises 


5. Consider the cases $\lambda = 0$ and $\lambda \neq 0$ separately.


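11. The characteristic polynomial is calculated using the Coates
digraph of the matrix $\lambda I_n - A$ as drawn in Figure IV:

$$P(\lambda) = (-1)^n \left( (-1)^n \lambda^n + (-1)^{n-1} a_1^2 \lambda^{n-2}
+ (-1)^{n-1} a_2^2 \lambda^{n-2} + \cdots + (-1)^{n-1} a_{n-1}^2 \lambda^{n-2} \right)
= \lambda^n - \lambda^{n-2} \sum_{i=1}^{n-1} a_i^2 = 0.$$

The eigenvalues are

$$\pm \left( \sum_{i=1}^{n-1} a_i^2 \right)^{1/2}, \quad 0 \ (n - 2 \text{ times}).$$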
[Figure IV: the Coates digraph of the matrix λI_n − A]


12. The number of different Jordan canonical forms equals the
number of partitions of the integer 6:

6; 5,1; 4,2; 4,1,1; 3,3; 3,2,1; 3,1,1,1; 2,2,2;
2,2,1,1; 2,1,1,1,1; 1,1,1,1,1,1

and so equals 11.


Chapter 8 Exercises


. The exponents are 26 and 25, respectively.


. An irreducible matrix of order n > 2 must contain at least
one nonzero entry in each row (and column) and so contains
at least n nonzero entries. If there are only n nonzero entries,
then its digraph is a circuit and hence not primitive.


. The key is that the digraph of A has at least one loop.


11. The adjacency matrix has r 1’s in each row and column.


14. The graphs given are cospectral.
Chapter 9 Exercises


. The permanent is $2^{n-1}$.


11. The matrix


—1 1 0 3 
Seal 1 0 

0 —1 —1i 1 
—1 0 —1 -1 


is an SNS-matrix. 


Chapter 10 Exercises 


. The adjacency matrix has two identical rows and so its
determinant equals 0.